What is the ideal partitioning method for my job?


ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Round Robin will preserve the generated order. However, the rows will be spread over the nodes defined in your configuration file. For example, if you have four nodes, your Data Set will receive:

node #0:  1  5  9  13  ...
node #1:  2  6  10  14  ...
node #2:  3  7  11  15  ...
node #3:  4  8  12  16  ...
When you use View Data to look at these rows, the order may appear scrambled; it depends on which node is able to report first. But the rows in the Data Set will be distributed as I have described.
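To make the distribution above concrete, here is a minimal Python sketch of round-robin dealing. It is only an illustration of the idea, not DataStage code: the function name `round_robin_partition` and the use of plain integers as rows are assumptions made for the example.

```python
def round_robin_partition(rows, num_nodes):
    """Deal rows to nodes in turn: row i goes to node i % num_nodes."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions


# Sixteen generated rows spread over a hypothetical four-node configuration.
parts = round_robin_partition(range(1, 17), 4)
for node, node_rows in enumerate(parts):
    print(f"node #{node}: ", "  ".join(str(r) for r in node_rows))
```

Within each node the rows stay in generated order; only the interleaving across nodes depends on which node is read first.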
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bkumar103
Participant
Posts: 214
Joined: Wed Jul 25, 2007 2:29 am
Location: Chennai

Post by bkumar103 »

There is no hard and fast rule that decides which partitioning method you have to use, except for some specific stages, such as Entire for Lookup, and Hash for Aggregator, Sort, etc. It all depends on your requirements and the input data set you get. Based on the input data and the stages you use, you have to make the call on partitioning. Auto is the default in most cases, which leaves the choice of partitioning to the DSEngine.
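As a rough illustration of why key-based stages such as Aggregator and Sort are usually paired with Hash partitioning, the Python sketch below sends every row with the same key value to the same partition, so each node can process its groups independently. This is not how the DataStage engine is implemented: the function name `hash_partition`, the sample column names, and the use of `zlib.crc32` as the hash function are all assumptions chosen for the example.

```python
import zlib


def hash_partition(rows, key, num_nodes):
    """Assign each row to a node based on a hash of its key column."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 is used only because it is deterministic across runs.
        node = zlib.crc32(str(row[key]).encode("utf-8")) % num_nodes
        partitions[node].append(row)
    return partitions


# Hypothetical input: several rows per customer key.
sample_rows = [
    {"cust": "A", "amt": 10},
    {"cust": "B", "amt": 20},
    {"cust": "A", "amt": 5},
    {"cust": "C", "amt": 7},
    {"cust": "B", "amt": 1},
]
hashed = hash_partition(sample_rows, "cust", 4)

# For each key, collect the set of nodes that hold rows with that key.
nodes_per_key = {}
for node, node_rows in enumerate(hashed):
    for row in node_rows:
        nodes_per_key.setdefault(row["cust"], set()).add(node)
print(nodes_per_key)
```

Each key maps to exactly one node, which is the property a per-key aggregation relies on; Round Robin gives no such guarantee.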
sunayan_pal
Participant
Posts: 49
Joined: Fri May 11, 2007 12:24 am
Location: kolkata

Post by sunayan_pal »

What happens when the job has to perform operations against a database?
Let's say the database is Teradata and the partitioning is set to Auto. How will the DSEngine decide, given that it would need information about database activities, such as whether collect statistics or a join index has been performed?
regards
sunayan