I am using three stage variables in a Transformer, with initial values of 1, 2 and 3 respectively, and incrementing each of them by 1. The output goes to three different output files.
For these three files, I should get this output:
First file - 1,2,3,4
Second file - 2,3,4,5
Third File - 3,4,5,6
I get this output when I run the job with a 1-node configuration file.
But when I run it with a 2-node configuration file, the output is:
First file - 3,3,3,3
Second file - 2,2,3,3
Third File - 3,3,4,4
I don't want my output to change when I use different node configuration files.
Please help.
Different output with different nodes configuration files
Identify (from the score) exactly what partitioning is being used at each stage in your job, and post that information here. Without it, it is not possible to provide cogent advice.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
There is a thread in the FAQ forum on implementing a counter in a Transformer of a parallel job. It is safer to base the counter on the parallel job system variables @NUMPARTITIONS and @PARTITIONNUM. Set the initial values of the three stage variables to:
@PARTITIONNUM - @NUMPARTITIONS +1
@PARTITIONNUM - @NUMPARTITIONS +2
@PARTITIONNUM - @NUMPARTITIONS +3
Then increment each variable by @NUMPARTITIONS (not by 1) in its derivation, and likewise for the other two variables:
StageVar1 = StageVar1 + @NUMPARTITIONS
This should give you three stage variables starting at 1, 2 and 3 and delivering unique numbers across the partitions. On two nodes, for example, partition 0 starts StageVar1 at -1 and produces 1, 3, 5, ... while partition 1 starts at 0 and produces 2, 4, 6, ... With round robin partitioning I think this will deliver the numbers in sequence, but they may be output out of sequence if one partition is faster than the others, and some numbers may be skipped at the end of a dataset if the partitions are not balanced. You should test it to see what happens.
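To check the arithmetic outside DataStage, here is a rough Python simulation of the idea. It is not DataStage code: the function names are made up for illustration, and it assumes round robin partitioning deals rows to the partitions in turn and that the stage variable is incremented once per row before it is written out. It compares a plain "start at N, add 1" counter, which every partition runs independently, with the partition-aware version.

```python
def rows_in_partition(num_rows, num_partitions, partition_num):
    """Rows dealt to one partition under the assumed round robin model."""
    return len(range(partition_num, num_rows, num_partitions))


def naive_counter(num_rows, num_partitions, start):
    """Every partition runs its own copy of the variable: start, start+1, ..."""
    return [
        [start + i for i in range(rows_in_partition(num_rows, num_partitions, p))]
        for p in range(num_partitions)
    ]


def partition_aware_counter(num_rows, num_partitions, offset):
    """Initial value PARTITIONNUM - NUMPARTITIONS + offset, step NUMPARTITIONS."""
    result = []
    for partition_num in range(num_partitions):
        value = partition_num - num_partitions + offset  # initial value
        emitted = []
        for _ in range(rows_in_partition(num_rows, num_partitions, partition_num)):
            value += num_partitions  # incremented once per row
            emitted.append(value)
        result.append(emitted)
    return result


def merged(per_partition):
    """Collect the values from all partitions, ignoring arrival order."""
    return sorted(v for part in per_partition for v in part)


if __name__ == "__main__":
    for nodes in (1, 2):
        for offset in (1, 2, 3):
            print(
                f"nodes={nodes} start={offset}",
                "naive:", merged(naive_counter(4, nodes, offset)),
                "partition-aware:", merged(partition_aware_counter(4, nodes, offset)),
            )
```

Under these assumptions, four rows on two nodes give 2, 2, 3, 3 for the plain counter that starts at 2 (the duplicated pattern reported for the second file), while the partition-aware counter gives 2, 3, 4, 5 on either one or two nodes. The sort in `merged` hides arrival order, which in a real job depends on how fast each partition runs.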
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn