Regarding Array Size

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Regarding Array Size

Post by manojbh31 »

Hi

Happy New Year to All.

I want to set the Array Size for my project. On what basis can this be achieved?
Please let me know, because performance is very low.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Array Size is neither a project-level variable nor a magic bullet for performance issues. You'll need to get more specific if you want specific help.
-craig

"You can never have too many knives" -- Logan Nine Fingers
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

OK. Are there any particular criteria for defining the Array Size? At my previous company the array size was 32767. On what basis is this value defined?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sheesh... that is a ridiculous value to use and falls into the 'bigger must be better' camp. Typically, you need to be aware of your network packet size and average record length, then adjust your Array Size accordingly.

In other words, there is no one magic value - it will vary from job to job. You'll need to experiment to find the 'sweet spot' for any particular job; anything too high will actually hinder performance.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Approximately (packet_size / row_size) or a small multiple thereof should be your starting point for experimentation.
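
As a rough illustration of that starting point, here is a minimal Python sketch. The function name `starting_array_size` and the sample figures are assumptions made for the example; they are not DataStage settings, and the result is only a first guess to experiment from.

```python
def starting_array_size(packet_size: int, row_size: int, multiple: int = 1) -> int:
    """Return a first-guess Array Size: (multiple * packet_size) // row_size.

    packet_size and row_size are in bytes. The result is never below 1,
    because at least one row has to be sent per array.
    """
    if packet_size <= 0 or row_size <= 0 or multiple <= 0:
        raise ValueError("packet_size, row_size and multiple must be positive")
    return max(1, (multiple * packet_size) // row_size)


# Illustrative figures only: a 1500-byte packet and a 120-byte average row.
print(starting_array_size(1500, 120))       # 12
print(starting_array_size(1500, 120, 4))    # 50
print(starting_array_size(1500, 4000))      # 1 (row larger than one packet)
```

Treat the number it prints as the first value to test, not as the answer.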
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
RodBarnes
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

ray.wurlod wrote: Approximately (packet_size / row_size) or a small multiple thereof should be your starting point for experimentation.
This makes sense. However, if the rows are typically larger than the packet size, each row will always be fragmented, and so the packet size really becomes no longer part of the equation. Do you agree?

It seems that with very large rows (e.g., a row with large text values), the array size is more a function of the memory on the DS server and the buffer size, to ensure the data doesn't overstep the buffer/memory by array_size * row_size.
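
To make the array_size * row_size concern concrete, here is a small Python sketch. The helper names and the 1 MiB budget are hypothetical, chosen only for illustration; the actual buffer limit depends on the stage and server configuration and is not stated in this thread.

```python
def buffer_footprint(array_size: int, row_size: int) -> int:
    """Bytes needed to hold one full array of rows: array_size * row_size."""
    return array_size * row_size


def max_array_size_for_buffer(buffer_bytes: int, row_size: int) -> int:
    """Largest Array Size whose rows fit in buffer_bytes (never below 1)."""
    if buffer_bytes <= 0 or row_size <= 0:
        raise ValueError("buffer_bytes and row_size must be positive")
    return max(1, buffer_bytes // row_size)


# Illustrative figures only: 8000-byte rows with the 32767 value mentioned earlier.
print(buffer_footprint(32767, 8000))              # 262136000 bytes per array
# Hypothetical 1 MiB budget for the same rows.
print(max_array_size_for_buffer(1024 * 1024, 8000))   # 131
```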
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Agree.

In that case something like (small_multiple_of_packet_size / row_size) ought to be close.

Trial and error seems to be necessary (making sure that all other variables are reasonably controlled for), and it's not even a smooth curve - some experiments have indicated a multimodal curve, then time for experimentation was curtailed.
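
One way to organise that trial and error is sketched below in Python. Everything here is an assumption for illustration: `run_job` stands for whatever you use to run the job with a given Array Size and return its elapsed seconds, and the timings in the demo are made up. Taking the median of a few repeats is just one simple way to damp run-to-run noise.

```python
import statistics
from typing import Callable, Dict, Iterable, List


def sweep_array_sizes(run_job: Callable[[int], float],
                      candidates: Iterable[int],
                      repeats: int = 3) -> Dict[int, float]:
    """Run the job `repeats` times per candidate; keep the median elapsed time."""
    results = {}
    for size in candidates:
        timings = [run_job(size) for _ in range(repeats)]
        results[size] = statistics.median(timings)
    return results


def local_minima(results: Dict[int, float]) -> List[int]:
    """Candidate sizes that are faster than both of their tested neighbours."""
    sizes = sorted(results)
    minima = []
    for i, size in enumerate(sizes):
        left = results[sizes[i - 1]] if i > 0 else float("inf")
        right = results[sizes[i + 1]] if i < len(sizes) - 1 else float("inf")
        if results[size] < left and results[size] < right:
            minima.append(size)
    return minima


# Made-up timings standing in for real job runs.
fake_timings = {10: 5.0, 20: 3.0, 40: 4.0, 80: 2.5, 160: 6.0}
results = sweep_array_sizes(lambda size: fake_timings[size], fake_timings)
print(local_minima(results))              # [20, 80]
print(min(results, key=results.get))      # 80
```

More than one local minimum in the output is a hint that the curve is not smooth, so a coarse sweep can miss the best value.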

I wonder if IBM pays anyone to research these kinds of things?
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Re: Regarding Array Size

Post by manojbh31 »

Hi, can you please tell me what the packet size is and where we can find it?
RodBarnes
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Re: Regarding Array Size

Post by RodBarnes »

manojbh31 wrote: Hi, can you please tell me what the packet size is and where we can find it?
Typically, the Ethernet packet size will be 1500 bytes or less. You can run the command "ping <target> -f -l <size>" repeatedly to find out what the current size really is. If the specified size is larger than the current packet size, the system will respond with "Packet needs to be fragmented...". Reduce the size value and run the command again. Repeat this until it responds with a normal ping reply, e.g., "Reply from <target>: bytes=<size>..."
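
The repeat-and-reduce procedure above can be automated with a binary search. The Python sketch below is one possible approach, not a tested tool: `windows_ping_probe` assumes the Windows `ping` flags used above (plus `-n 1` to send a single request) and English-language output, and the search assumes that any size smaller than a passing size also passes. Note that `-l` sets the ICMP payload size, so on IPv4 the packet on the wire is 28 bytes larger (20-byte IP header plus 8-byte ICMP header).

```python
import subprocess
from typing import Callable, Optional


def largest_unfragmented_payload(probe: Callable[[int], bool],
                                 low: int = 0,
                                 high: int = 1500) -> Optional[int]:
    """Largest size in [low, high] for which probe(size) is True.

    Returns None if even `low` fails (for example, the host is unreachable).
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe(mid):
            low = mid                 # mid fits; look for something larger
        else:
            high = mid - 1            # mid was too big; look below it
    return low


def windows_ping_probe(target: str) -> Callable[[int], bool]:
    """Build a probe that sends one don't-fragment ping (Windows syntax assumed)."""
    def probe(size: int) -> bool:
        completed = subprocess.run(
            ["ping", "-n", "1", "-f", "-l", str(size), target],
            capture_output=True, text=True,
        )
        # Assumes English output such as "Packet needs to be fragmented but DF set."
        return completed.returncode == 0 and "fragmented" not in completed.stdout
    return probe


# Demo with a stand-in probe that behaves like a path with a 1500-byte MTU.
payload = largest_unfragmented_payload(lambda size: size <= 1472)
print(payload, payload + 28)   # 1472 1500
```

Against a real host you would call `largest_unfragmented_payload(windows_ping_probe("<target>"))`, with `<target>` replaced by your database server.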
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I would think you should be able to ask your Network peoples.