Preferred value for 'Insert Array Size'

new_dsuser
Participant
Posts: 5
Joined: Sat Apr 17, 2010 10:56 am

Post by new_dsuser »

Hello,

I'm working on some parallel jobs that load Oracle 11g tables.
I'm using the User Defined Update/Insert option.
I was wondering whether there is a preferred value for the 'Insert Array Size' parameter. The default is 500, and I've noticed that increasing it makes the job run faster because of the reduced number of database round trips. I'm thinking of setting it to at least 2000.

Can anyone shed some light on the best practice for setting this value?

Thanks!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There are three considerations.

The first is your commit interval. The array size should not be larger than the commit interval, and the rows-per-transaction value should ideally be a whole multiple of the array size.
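
As a rough sketch of that relationship (the numbers, and the Python below, are my own illustrative assumptions, not anything DataStage-specific):

Code:

array_size = 2000         # rows sent to Oracle per round trip (assumed)
transaction_size = 10000  # rows per commit, i.e. the commit interval (assumed)

# the array size should not exceed the commit interval
assert array_size <= transaction_size

# the commit interval should be a whole multiple of the array size,
# so every transaction consists of complete array sends
assert transaction_size % array_size == 0
print(transaction_size // array_size, "array sends per commit")  # -> 5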

The second is related: if you are using auto-commit, or if you need accurate reporting of which rows caused errors, you must specify an array size of 1.

The third is what array size is meant to do. DataStage sends and receives arrays of values, and these should fit nicely into network packets. So the "best" value is one where the array payload (array size times row size) fills one or a few network packets cleanly, so that not too much space is wasted in network transport.
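
As a back-of-the-envelope sketch of that calculation (the row width, packet size, and the Python below are illustrative assumptions, not values from this post):

Code:

row_bytes = 200        # assumed average width of one row, in bytes
packet_bytes = 8192    # assumed usable payload per network packet/SDU

rows_per_packet = packet_bytes // row_bytes   # 40 whole rows per packet

# round a candidate array size down to a whole number of packets' worth
candidate = 2000
array_size = (candidate // rows_per_packet) * rows_per_packet
print(array_size)  # 2000 (already a whole multiple of 40)
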
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.