Divide records to links

Posted: Wed Oct 07, 2015 5:17 am
by AhmedSamir
Hi there,

I want to divide the volume of records evenly across several output links.

Example:

sequ_file (15 records) -----> ????? (which stage?) -----> Link 1 (5 records)
                                                   -----> Link 2 (5 records)
                                                   -----> Link 3 (5 records)

Thank you.

Posted: Wed Oct 07, 2015 5:34 am
by ShaneMuir
Do you have any criteria for splitting your records, or do you just want them split equally?

What is the target after the split? Are you processing each stream of data separately afterwards? I.e., what are you trying to achieve?

Posted: Wed Oct 07, 2015 5:45 am
by AhmedSamir
Thank you, ShaneMuir.

1. I just want them split equally.

2. I only need the split, nothing else.

Posted: Wed Oct 07, 2015 6:34 am
by ShaneMuir
Are you using multiple nodes to process?

If you just want to split them evenly then I would use the Mod function on the input row number in a transformer stage variable. You can then set the constraint on each output link to test that variable.

e.g. set the stage variable svOutputLink to:

Code:

Mod(@INROWNUM,3)

Set the output link constraints:

Link1: svOutputLink=0
Link2: svOutputLink=1
Link3: svOutputLink=2

Note that this will split the data evenly between 3 output datastreams. The above logic assumes using one node.
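
To see why the Mod logic gives an even split, here is a rough sketch in Python rather than DataStage. It is only an illustration: the function name split_by_row_number and the sample records are made up, and the row counter starts at 1 on the assumption that @INROWNUM does too.

```python
def split_by_row_number(records, link_count=3):
    """Send each record to the link whose index equals (row number mod link_count)."""
    # One list per output link; index 0 plays the role of Link1, 1 of Link2, and so on.
    links = [[] for _ in range(link_count)]
    # Assumption: the row counter starts at 1, mirroring @INROWNUM.
    for in_row_num, record in enumerate(records, start=1):
        output_link = in_row_num % link_count  # same idea as Mod(@INROWNUM,3)
        links[output_link].append(record)
    return links


# Hypothetical sample input: 15 records, as in the original question.
records = [f"rec{i}" for i in range(1, 16)]
links = split_by_row_number(records)

for index, rows in enumerate(links):
    print(f"svOutputLink={index}: {len(rows)} records")
```

With 15 input records each of the three lists ends up with 5. When the record count is not a multiple of 3, the lists differ in size by at most one record.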

There might be other solutions depending on why you want to split the data (i.e. you might just be able to run a multi-node configuration, in which you have a single output link but that link is split into 3 processing streams).

Posted: Wed Oct 07, 2015 6:42 am
by AhmedSamir
Thanks, but there is no stage for doing this.

Posted: Wed Oct 07, 2015 7:15 am
by ShaneMuir
Of course there is a stage - the transformer stage.

There is also a switch stage - but that would also use the same criteria as a transformer stage (I think).

Then again - I cannot think of a reason why you would want to split your input rows without some sort of criteria - for me, splitting of rows is usually because you want to isolate some records for different processing from the other records.

Posted: Wed Oct 07, 2015 7:27 am
by chulett
You could also leverage "round robin" as a partitioning method if you wanted the split to be across all nodes, be it three or whatever.

Posted: Wed Oct 07, 2015 7:50 am
by ShaneMuir
chulett wrote: You could also leverage "round robin" as a partitioning method if you wanted the split to be across all nodes, be it three or whatever.

I was going to suggest this - but the OP seemed to want the data in 3 distinct links rather than 3 nodes - this is why I am asking why the data is being split. If it's just 3 different partitions of data for parallel processing then yes, use round robin partitioning.
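
For anyone unfamiliar with the term, round robin simply deals the rows out to the partitions in turn. The Python sketch below shows the idea only; it is not how DataStage implements it, and the name round_robin_partition and the sample data are hypothetical.

```python
from itertools import cycle


def round_robin_partition(records, node_count):
    """Deal records out to node_count partitions in turn, like dealing cards."""
    partitions = [[] for _ in range(node_count)]
    # cycle() repeats the partitions endlessly; zip() stops when the records run out.
    for partition, record in zip(cycle(partitions), records):
        partition.append(record)
    return partitions


# Hypothetical sample input: 15 records spread over 3 nodes.
partitions = round_robin_partition([f"rec{i}" for i in range(1, 16)], 3)

for node, rows in enumerate(partitions):
    print(f"node {node}: {len(rows)} records")
```

The difference from the transformer approach is where the split lives: here the rows stay on one link and are spread over the nodes, so the number of streams follows the node count rather than the number of output links drawn in the job.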