Divide records to links


Moderators: chulett, rschirm, roy

AhmedSamir
Participant
Posts: 10
Joined: Mon Dec 15, 2014 8:33 am

Post by AhmedSamir »

hi there,

I want to divide the record volume evenly across the output links.

Example:

sequ_file (15 records) ----> ????? (which stage?) ----> Link 1 (5 records)
                                                  ----> Link 2 (5 records)
                                                  ----> Link 3 (5 records)

thank you
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

Do you have any criteria for splitting your records, or do you just want them split equally?

What is the target after the split? Are you processing each stream of data separately afterwards? I.e., what are you trying to achieve?
AhmedSamir
Participant
Posts: 10
Joined: Mon Dec 15, 2014 8:33 am

Post by AhmedSamir »

Thank you, ShaneMuir.

1. I just want them split equally.

2. I only need the split, nothing more.
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

Are you using multiple nodes to process?

If you just want to split them evenly then I would use the Mod function on the input row number in a Transformer stage variable. You can then set the constraint on each output link accordingly.

E.g. set the stage variable svOutputLink to:

Mod(@INROWNUM,3)

Then set the output link constraints:

Link1: svOutputLink=0
Link2: svOutputLink=1
Link3: svOutputLink=2

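Note that this will split the data evenly between 3 output data streams. The above logic assumes a single node: on multiple nodes @INROWNUM counts rows within each partition, so the totals per link may not come out exactly equal.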
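
To see which rows land on which link, here is a rough simulation of the modulus logic in Python rather than in DataStage. It is only a sketch: the function name split_by_mod and the sample records are made up for illustration, and it assumes a single node with the row number starting at 1, as @INROWNUM does.

```python
# Illustrative simulation only; this is not DataStage code.
# split_by_mod and the rec1..rec15 sample data are hypothetical names.

def split_by_mod(records, link_count=3):
    """Group records by (row number mod link_count), mimicking Mod(@INROWNUM, 3)."""
    links = {remainder: [] for remainder in range(link_count)}
    # The row number starts at 1, like @INROWNUM on a single node.
    for inrownum, record in enumerate(records, start=1):
        links[inrownum % link_count].append(record)
    return links


records = [f"rec{i}" for i in range(1, 16)]  # 15 input records
links = split_by_mod(records)

# Remainder 0 corresponds to Link1, 1 to Link2, 2 to Link3.
for remainder, rows in links.items():
    print(f"svOutputLink={remainder}: {len(rows)} rows -> {rows}")
```

Each of the three groups ends up with 5 of the 15 records. Because the row number starts at 1, the first record goes to the link whose constraint is svOutputLink=1 (Link2 above), not to Link1.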

There might be other solutions depending on why you want to split the data (i.e. you might just be able to run a multi-node configuration, whereby you have a single output link but that link is split into 3 processing streams).
AhmedSamir
Participant
Posts: 10
Joined: Mon Dec 15, 2014 8:33 am

Post by AhmedSamir »

Thanks, but there is no stage that handles this case.
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

Of course there is a stage - the transformer stage.

There is also a switch stage - but that would also use the same criteria as a transformer stage (I think).

Then again - I cannot think of a reason why you would want to split your input rows without some sort of criteria - for me, splitting of rows is usually because you want to isolate some records for different processing from the other records.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You could also leverage "round robin" as a partitioning method if you wanted the split to be across all nodes, be it three or whatever.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

chulett wrote: You could also leverage "round robin" as a partitioning method if you wanted the split to be across all nodes, be it three or whatever.
I was going to suggest this, but the OP seemed to want the data in 3 distinct links rather than 3 nodes, which is why I am asking why the data is being split. If it's just 3 different partitions of data for parallel processing then yes, use round robin partitioning.
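
For comparison, round robin partitioning deals rows out to the nodes in turn. The Python sketch below only illustrates that idea and is not how the engine is implemented. The function name round_robin_partition, the node count of 3 and the sample records are all assumptions made for the example.

```python
# Illustrative simulation only; this is not DataStage code.
# round_robin_partition and the sample data are hypothetical names.
from itertools import cycle


def round_robin_partition(records, node_count=3):
    """Deal records out to node_count partitions in turn."""
    partitions = [[] for _ in range(node_count)]
    # cycle() repeats 0, 1, 2, 0, 1, 2, ... for as long as there are records.
    for record, target in zip(records, cycle(range(node_count))):
        partitions[target].append(record)
    return partitions


rr_partitions = round_robin_partition([f"rec{i}" for i in range(1, 16)])

for node, rows in enumerate(rr_partitions):
    print(f"node {node}: {len(rows)} rows -> {rows}")
```

The row counts come out the same as with the Mod approach (5 per group for 15 records), but here the groups are partitions of one link, not separate output links.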