Some partition doubts

ashish10mca · Post by **ashish10mca** » Mon Jan 24, 2011 2:09 am

Doubt 1: If Entire partitioning is used and every record is spooled as-is into each partition,
then we get 4 times the source records (if the config file has 4 nodes).
Right or wrong?

Doubt 2: If Same is used in the first stage, what would the default partitioning be? (Same follows the partitioning strategy of the upcoming stage, so if it is applied in the first stage, which partitioning would DataStage go for?)

Doubt 3: What is the difference between Round Robin and Random partitioning? (Random partitioning is known to distribute records randomly among the nodes, so how does it manage load balancing?)

srinivas.g · Post by **srinivas.g** » Mon Jan 24, 2011 7:26 am

1. Yes. If the reference link has 100 records, then each partition has 100 records.

2. Same partitioning means the stage keeps the partitioning of the previous stage.
3. Yes.

ray.wurlod · Post by **ray.wurlod** » Mon Jan 24, 2011 2:07 pm

3. It doesn't. It's random. But, for a large enough number of rows, random distribution will put close to 1/N of the rows on each of the N nodes.
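To illustrate the 1/N point, here is a minimal simulation in plain Python. It is not DataStage's actual random partitioner; the function name `random_partition_counts`, the fixed seed, and the row count are assumptions made up for this sketch.

```python
import random
from collections import Counter

def random_partition_counts(num_rows, num_nodes, seed=0):
    """Assign each row to a uniformly random node and count rows per node."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    counts = Counter(rng.randrange(num_nodes) for _ in range(num_rows))
    return [counts[node] for node in range(num_nodes)]

num_rows, num_nodes = 100_000, 4
counts = random_partition_counts(num_rows, num_nodes)
shares = [c / num_rows for c in counts]

# Each share should sit near 1/4, but nothing forces the counts to be equal.
print(counts)
print(shares)
```

With only a handful of rows the shares can be far from 1/N, which is why the balance is only approximate and only for large inputs.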

abc123 · Post by **abc123** » Thu Jan 27, 2011 5:11 pm

1. I think your question is: do the incoming rows in a stage with Entire partitioning set get quadrupled in a 4-node configuration? That is, do all rows go into each output partition?

Answer is yes.
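A small Python sketch of what that means, assuming Entire simply hands every node a full copy of the input. The helper `entire_partition` is hypothetical and only models the row counts, not how DataStage moves the data.

```python
def entire_partition(rows, num_nodes):
    """Give every node its own full copy of the input rows."""
    return [list(rows) for _ in range(num_nodes)]

source = ["r1", "r2", "r3"]
partitions = entire_partition(source, num_nodes=4)

# Total rows across all partitions = source rows * number of nodes.
total = sum(len(p) for p in partitions)
print(total)  # 12: three source rows times four nodes
```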

ThilSe · Post by **ThilSe** » Fri Feb 04, 2011 2:31 pm

2. Selecting 'Same' partitioning on a source/output link will try to reuse the partitioning of the source data set created by a prior job, which avoids repartitioning the data. Although 'Same' can be used with source database stages, we need to be careful when data is read in parallel from a partitioned DB2 table (and I guess an Oracle table as well), e.g. when using the 'current node' clause.

Thanks,
Senthil

datastagesandeep · Post by **datastagesandeep** » Sat Feb 05, 2011 1:38 am

ANSWER: Round Robin and Random differ in how they distribute rows.
Let's take an example.
Suppose my data is (5,8,3,9,4,6,7,5,8,12,45,98,36,14).

Round Robin across three nodes will give:
First: (5,9,7,12,36)
Second:(8,4,5,45,14)
Third:(3,6,8,98)

But Random could be: (this is one of the possible way)
First: (5,14,5,4,98)
Second:(8,12,7,45,9)
Third:(3,6,8,36)

Hope this is helpful.
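The Round Robin split above can be reproduced with a short Python sketch. The function name `round_robin_partition` is made up for illustration; it just deals rows to nodes in turn, which is the behaviour described in the example.

```python
def round_robin_partition(rows, num_nodes):
    """Deal rows to nodes in turn: row i goes to node i % num_nodes."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions

data = [5, 8, 3, 9, 4, 6, 7, 5, 8, 12, 45, 98, 36, 14]
rr = round_robin_partition(data, 3)

for name, part in zip(["First", "Second", "Third"], rr):
    print(name, part)
# First [5, 9, 7, 12, 36]
# Second [8, 4, 5, 45, 14]
# Third [3, 6, 8, 98]
```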

PhilHibbs · Post by **PhilHibbs** » Mon Aug 20, 2012 8:46 am

datastagesandeep wrote:But Random could be: (this is one of the possible way)
First: (5,14,5,4,98)
Second:(8,12,7,45,9)
Third:(3,6,8,36)

Sorry to reply to such an old post but this needs to be corrected. That is NOT possible - since 14 is the last input value, it can only ever be the last value out of whichever node it goes to. It can't turn up as the second value in a node, unless that node only received 2 values.

Unless there is also a sort happening on some other unseen value.
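The ordering argument can be checked with a Python sketch. It assumes, as stated above, that each node receives its rows in arrival order and that no sort happens afterwards; `random_partition` is a hypothetical stand-in, not DataStage's implementation.

```python
import random

def random_partition(rows, num_nodes, seed=None):
    """Send each row to a random node, appending in arrival order."""
    rng = random.Random(seed)
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[rng.randrange(num_nodes)].append(row)
    return partitions

data = [5, 8, 3, 9, 4, 6, 7, 5, 8, 12, 45, 98, 36, 14]

# Try many different random assignments: 14 is the last input row,
# so it must be the last row of whichever node receives it.
always_last = True
for seed in range(1000):
    parts = random_partition(data, 3, seed)
    holder = next(p for p in parts if 14 in p)
    if holder[-1] != 14:
        always_last = False

print(always_last)  # True
```

Which node gets 14 varies from run to run, but its position within that node does not.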
