Regarding Round Robin partition

soumya5891
Participant
Posts: 152
Joined: Mon Mar 07, 2011 6:16 am


Post by soumya5891 »

I am trying to generate unique sequential numbers without any gaps between them, so I applied the following formula in a Transformer that performs only this operation.

Formula : @INROWNUM * @NUMPARTITIONS + @PARTITIONNUM - @NUMPARTITIONS - 1


The input partitioning to the Transformer is set to Round Robin. DataStage is running on a four-node configuration.

But I'm getting some gaps towards the end of the sequence. For example, if the sequence starts at 1 and ends at 30, it is generated correctly up to 25, and after that the next value is 27, and so on.

When I printed the @PARTITIONNUM of each record in the dataset, it showed that the round robin partitioning is not distributing the records evenly.

For example, for 27 records:

Partition 0 : 8 records
Partition 1 : 8 records
Partition 2 : 6 records
Partition 3 : 5 records


This happens when the input to the Transformer is round robin partitioned.

But as per the round robin method, the records should be distributed in the following way:


Partition 0 : 7 records
Partition 1 : 7 records
Partition 2 : 7 records
Partition 3 : 6 records
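
To see how the row counts translate into gaps, here is a minimal Python sketch (not DataStage code) that applies the formula above to every row of every partition. It assumes @INROWNUM starts at 1 and @PARTITIONNUM starts at 0 within each partition; the function names are illustrative only.

```python
def generated_values(counts):
    """Apply the Transformer formula to every row of every partition.

    counts[p] is the number of rows that landed in partition p.
    Assumption: @INROWNUM is 1-based and @PARTITIONNUM is 0-based.
    """
    num_partitions = len(counts)
    values = []
    for partition_num, rows in enumerate(counts):
        for in_row_num in range(1, rows + 1):
            values.append(in_row_num * num_partitions + partition_num
                          - num_partitions - 1)
    return sorted(values)


def missing_values(values):
    """Return the integers skipped between the smallest and largest value."""
    present = set(values)
    return [v for v in range(values[0], values[-1] + 1) if v not in present]


observed = generated_values([8, 8, 6, 5])   # counts seen in the dataset
expected = generated_values([7, 7, 7, 6])   # counts of an even round robin

print(missing_values(observed))   # skipped values for the uneven counts
print(missing_values(expected))   # empty list when the counts are even
```

With the observed counts the sketch reports three skipped values near the end of the range, and with the even counts it reports none. Under the stated assumptions the formula starts at -1 rather than 1, so only the positions of the gaps are meaningful here.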
Soumya
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

The only way to guarantee that there are no breaks between assigned sequence numbers is to assign them in a single partition (i.e. run your job or Transformer sequentially), or to assign them using the Surrogate Key Generator with a block size of 1. This has been discussed many times before in the forum.

Round Robin does not guarantee an absolutely even distribution such as you are expecting, especially with such a small sample of data. If the upstream stage is running in parallel, the round robin results will depend upon how the data was already distributed prior to repartitioning with round robin (each logical partition runs its own round robin partitioner).
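
As a rough illustration of that last point, the Python sketch below (a simulation, not DataStage code) counts how many rows reach each downstream partition when every upstream partition deals out its own rows independently. The starting partition of each partitioner and the upstream split used in the example are assumptions made for the illustration, not documented engine behaviour.

```python
def round_robin_counts(upstream_rows, num_partitions):
    """Count rows per downstream partition when every upstream partition
    runs its own independent round robin partitioner.

    upstream_rows[i] is the number of rows held by upstream partition i.
    Assumption: each partitioner starts dealing at downstream partition 0.
    """
    counts = [0] * num_partitions
    for rows in upstream_rows:
        for i in range(rows):
            counts[i % num_partitions] += 1
    return counts


# Sequential upstream: a single stream of 27 rows.
print(round_robin_counts([27], 4))

# Parallel upstream: a hypothetical split of the same 27 rows.
print(round_robin_counts([6, 6, 8, 7], 4))
```

A single upstream stream yields the even 7/7/7/6 split, while this particular hypothetical four-way split happens to yield 8/8/6/5, the counts reported in the question.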

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
ssnegi
Participant
Posts: 138
Joined: Thu Nov 15, 2007 4:17 am
Location: Sydney, Australia

Surrogate Key in Transformer

Post by ssnegi »

Transformer: set the input partitioning to Round Robin. Select the MAX surrogate key value from the table.

Then put the following derivation in a stage variable:
NullToZero(InputColumn.MAX) + (@NUMPARTITIONS*(@INROWNUM-1)) + @PARTITIONNUM + 1
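
The derivation can be mirrored in a few lines of Python to check what it produces. This is only a sketch: `next_key` is a hypothetical helper, `None` stands in for a null MAX value, and the starting key of 100 is made up for the example. It assumes @INROWNUM is 1-based and @PARTITIONNUM is 0-based.

```python
def next_key(max_key, num_partitions, partition_num, in_row_num):
    """Mirror of the stage variable derivation.

    max_key may be None when the table is empty (the NullToZero case).
    """
    base = 0 if max_key is None else max_key
    return base + num_partitions * (in_row_num - 1) + partition_num + 1


# Four partitions with three rows each, i.e. a perfectly even distribution.
keys = sorted(
    next_key(100, 4, partition_num, in_row_num)
    for partition_num in range(4)
    for in_row_num in range(1, 4)
)
print(keys)
```

With an even distribution the keys continue from the existing maximum without gaps; if one partition receives fewer rows than the others, the values it would have produced are simply skipped.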
Last edited by ssnegi on Thu Feb 27, 2014 10:20 pm, edited 1 time in total.
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

Hi,
If the upstream operator is running sequentially and the Transformer input is round robin partitioned, then this formula will generate sequence numbers without any gaps.