PART and PARTCOUNT

Vasanth
Participant
Posts: 10
Joined: Tue Apr 10, 2007 1:37 am
Location: M nagar

Post by Vasanth »

Can anyone explain how PART and PARTCOUNT in the Row Generator stage actually work at the node level (with 4 nodes, 8 nodes, etc.)?

Also, please explain what PART and PARTCOUNT are.

Thanks in advance,
Vasanth
devidotcom
Participant
Posts: 247
Joined: Thu Apr 27, 2006 6:38 am
Location: Hyderabad

Post by devidotcom »

I guess PART is the partition the data is in, and PARTCOUNT is the number of partitions on which the job is running.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

By default the Row Generator stage operates in sequential mode. You would have to set it to execute in parallel mode for these values to make any sense. PART is the number of the partition on which a particular process is executing (starting from zero), while PARTCOUNT is the number of partitions currently in use, which is governed by the configuration file selected through the APT_CONFIG_FILE environment variable.

Therefore, if you had four nodes, PART would be 0, 1, 2 or 3 and PARTCOUNT would be 4.

If you had eight nodes, PART would be 0, 1, 2, 3, 4, 5, 6 or 7 and PARTCOUNT would be 8.
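
To make the numbering concrete, here is a small Python sketch that lists the values each partition would see. It is only an illustration of the description above, not DataStage code; the function name partition_values is made up for this example.

```python
def partition_values(node_count):
    """Return the (PART, PARTCOUNT) pair seen by each partition.

    Hypothetical helper: PART runs from 0 to node_count - 1,
    and PARTCOUNT is the same on every partition.
    """
    return [(part, node_count) for part in range(node_count)]


# Four-node case from the post above.
for part, partcount in partition_values(4):
    print(f"PART={part} PARTCOUNT={partcount}")
```

Calling partition_values(8) gives the eight-node case in the same way.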
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Vasanth
Participant
Posts: 10
Joined: Tue Apr 10, 2007 1:37 am
Location: M nagar

Post by Vasanth »

Ray,

May I know how a node works?

Are nodes used only to carry data, or do they perform actual calculations (addition, for example)?

Suppose I have a Row Generator stage generating a sequence with initial value = 0 and increment = 15, and I run the job with a 2-node configuration. How does the first node take the value 0 and the second node generate 15? Where does the actual increment processing happen?

If nodes are used only to carry data, how is parallelism achieved here?

Thanks in advance,
Vasanth
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

If you hard-code the initial value as 0 and the increment as 15, then each of your processing nodes (as defined in the configuration file) will generate the same sequence, namely 0,15,30,45,...

This is probably not what you want. If you set the initial value to PART and the increment to PARTCOUNT and you have four nodes, then you will get the following sequences generated in parallel execution mode:
node #0: 0,4,8,12,...
node #1: 1,5,9,13,...
node #2: 2,6,10,14,...
node #3: 3,7,11,15,...
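
The two cases can be simulated outside DataStage with a short Python sketch. This is an illustration only, under the assumption that each partition independently applies the same initial value and increment settings; the names generated_values and simulate are made up for the example.

```python
from itertools import count, islice


def generated_values(initial, increment, rows):
    """First `rows` values of the sequence initial, initial + increment, ..."""
    return list(islice(count(initial, increment), rows))


def simulate(node_count, rows_per_node, use_part=True):
    """Return {partition number: generated values} for one simulated run.

    use_part=True  -> initial value = PART, increment = PARTCOUNT
    use_part=False -> hard-coded initial value 0 and increment 15
    """
    result = {}
    for part in range(node_count):
        if use_part:
            initial, increment = part, node_count
        else:
            initial, increment = 0, 15
        result[part] = generated_values(initial, increment, rows_per_node)
    return result


# Hard-coded settings: every partition produces the same values.
print(simulate(2, 4, use_part=False))

# PART / PARTCOUNT settings: the partitions interleave without overlap.
for part, values in simulate(4, 4).items():
    print(f"node #{part}: {values}")
```

With PART and PARTCOUNT, merging the four lists gives every integer from 0 to 15 exactly once, whereas the hard-coded settings give each value once per partition.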