Array Size Impact on Column Length :(

Post questions here relating to DataStage Server Edition, covering such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

badhri
Participant
Posts: 42
Joined: Tue Mar 19, 2002 8:15 pm

Array Size Impact on Column Length :(

Post by badhri »

Greetings,

I'm facing a weird issue and need your help in this regard.

I have a one-to-one mapping using the OCI9 plug-in, and one of the columns in these tables is a VARCHAR2(4000). When I run the job with an Array Size of 1000, the target table is updated with NULL instead of the column value. If I change the Array Size from 1000 to 1, the correct values appear in the target table.

Any idea why this is happening and how to rectify it?

I use DataStage version 6.0.1 on Sun Solaris 2.8.

Thanks in Advance.

Regards,
Badhri ...

Badhrinath Krishnamoorthy
www.cognizant.com
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Badhri

I think nobody is answering this question because there should be no relationship between Array Size and whether this field gets populated, except for rollback segments. You may be creating a large rollback segment and running out of space on the target database.

Your Array Size buffers rows between commits. Lower it from 1000 and try again. Let us know if something between 1 and 1000 works. I think you are on the right track - keep playing with it.
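If you want to narrow it down systematically rather than by trial and error, a simple bisection over the Array Size will find the threshold in about ten runs, assuming the behaviour is monotonic (works below some value, fails above it). This is only a sketch in Python; job_runs_clean() is a placeholder for however you run the job with a given Array Size (for example via a job parameter) and verify the VARCHAR2(4000) column in the target:

def job_runs_clean(array_size):
    # Placeholder: run the DataStage job with this Array Size (e.g. passed in
    # as a job parameter) and return True if the VARCHAR2(4000) column loaded
    # without NULLs.  Replace with your own run-and-check logic.
    raise NotImplementedError

def find_threshold(lo=1, hi=1000):
    # Bisect for the largest Array Size that still loads the column correctly.
    while lo < hi:
        mid = (lo + hi + 1) // 2      # round up so the search terminates
        if job_runs_clean(mid):
            lo = mid                  # mid works; try something larger
        else:
            hi = mid - 1              # mid fails; try something smaller
    return lo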

Kim.

Kim Duke
DwNav - ETL Navigator
www.Duke-Consulting.com
badhri
Participant
Posts: 42
Joined: Tue Mar 19, 2002 8:15 pm

Post by badhri »

Thanks Kim.

At last, a reply that really made me happy [:I]. I will try array sizes below 1000 and see where the threshold is.

By the way, is there anywhere I can learn more about the internal workings of the Array Size parameter, to get better insight into this?

Thanks,
Badhri ...

Badhrinath Krishnamoorthy
www.cognizant.com
debajitp
Participant
Posts: 7
Joined: Wed Jun 11, 2003 6:15 am
Location: India
Contact:

Post by debajitp »

Hi Badhri,

I am also trying to figure out the exact use of the Array Size property. My thinking so far is that we should calculate the number of bytes in a row and, considering the network bandwidth between the Oracle server and the DataStage server, arrive at a figure for the number of rows that can be sent over the network at a time.
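For example, a rough back-of-the-envelope calculation might look like the Python below (the row length and buffer figures are purely illustrative, not measured values):

# Rough Array Size estimate from average row length and a target buffer size.
# Both figures are illustrative only.
avg_row_bytes = 4000 + 50            # e.g. the VARCHAR2(4000) column plus key columns
target_buffer = 256 * 1024           # bytes we are willing to ship per round trip

array_size = max(1, target_buffer // avg_row_bytes)
print(array_size)                    # -> 64 rows per array fetch/insert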

And if that is the case, how does the rollback segment size come into play?


Can you please share your findings?

Also, what is the difference between the 'Transaction size' and 'Rows per transaction' attributes? Or are they the same?



D Paul
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

D Paul

I think that is a very good idea. There should be some kind of relationship between average row length, number of rows, and packet size. We used to do database performance tuning based on the buffer size of a filesystem. I have always thought bytes/second is a better performance measure than rows/second.
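As a trivial illustration (with made-up numbers), converting a rows/second figure into bytes/second only needs the average row length:

rows_per_second = 2500        # rate reported by the job monitor (example value)
avg_row_bytes   = 1200        # measured or estimated average row length

bytes_per_second = rows_per_second * avg_row_bytes
print(bytes_per_second)       # 3,000,000 bytes/s, roughly 2.9 MB/s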

Kim.

Kim Duke
DsWebMon - Monitor over the web
www.Duke-Consulting.com
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

D Paul,

On the subject of 'Transaction size' versus 'Rows per transaction': the 'Transaction size' box on the General tab is obsolete and ignored; it is only maintained for backwards compatibility when upgrading jobs from an older version of DataStage. The only setting that matters in the newer OCI stages is 'Rows per transaction' on the Transaction Handling tab. This is documented in the online help for the General tab of the OCI8/9 stages.

-craig
debajitp
Participant
Posts: 7
Joined: Wed Jun 11, 2003 6:15 am
Location: India

Post by debajitp »

Thanks Craig.

Should I go forward with the approach I described for setting the Array Size? If any of you have an authoritative answer, please do let me know.

Thanks a lot in advance.

D Paul
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Array Size is always tricky, and I don't think you can ever find a 'golden' value that is exactly right for a particular job - only one that gives a good performance improvement without introducing... quirks. [:)] As mentioned, keep in mind that the job will be attempting to send across the network, all at once, a (maximum) amount of data equal to your row length multiplied by the Array Size. Technically, *average* row length, but we won't go into that.

Kim's comment "Your array size buffers commits" is a little misleading. Yes and no. When you are reading from Oracle, it is as simple as noted above - X rows come over the network in a 'packet' at a time to feed your job. When writing, it gets a little more complicated. What you are buffering to Oracle are the DML statements - the inserts/updates/deletes - and a 'packet' of them is sent to Oracle based on the Array Size. Committing is related, but it is controlled by the Transaction Size parameter, not the Array Size.

Odd combinations of the two can cause weird, hard-to-diagnose problems. I'd suggest keeping one as a multiple of the other, but (as you found with the Array Size) don't go overboard on the sizing. A simple example: an Array Size of 10 with a Transaction Size of 100. One last thing - I for one would never set the Array Size larger than the Transaction Size when the Transaction Size is greater than 1.
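As a minimal sketch of the arithmetic involved (the numbers below are only an example; the actual buffering happens inside the OCI stage):

# Illustrative only: how Array Size and Transaction Size interact when writing.
array_size       = 10         # DML statements shipped to Oracle per round trip
transaction_size = 100        # rows per commit - a clean multiple of the array size
rows_to_load     = 1_000_000

round_trips = -(-rows_to_load // array_size)        # ceiling division: 100,000 sends
commits     = -(-rows_to_load // transaction_size)  # 10,000 commits

# Keeping transaction_size an exact multiple of array_size means a commit
# never falls in the middle of an array send.
assert transaction_size % array_size == 0
print(round_trips, commits)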

Hope this helps,

-craig