Different output with different nodes configuration files

Anoop3d
Participant
Posts: 26
Joined: Wed Apr 26, 2006 10:53 pm
Location: san francisco

Post by Anoop3d »

I am using 3 different stage variables in a Transformer, with initial values of, say, 1, 2 and 3 respectively, and incrementing each of them by 1. The output goes to 3 different output files.
For these 3 files, I should get output as
First file - 1,2,3,4
Second file - 2,3,4,5
Third File - 3,4,5,6
I get this output when I run my job on a 1-node configuration.
But when I run it on a 2-node configuration, it gives output as -
First file - 3,3,3,3
Second file - 2,2,3,3
Third File - 3,3,4,4
I don't want my output to change when I use different node configuration files.
Please help
Ankush
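
One likely explanation for the duplicates can be sketched in Python. This is only a simulation under stated assumptions, not DataStage code: it assumes each partition runs its own independent copy of the Transformer (and so its own copy of each stage variable), that 4 input rows are spread round robin across the partitions, and that the output is collected in sorted order. The function names are hypothetical.

```python
def simulate_per_partition_counter(total_rows, num_partitions, start):
    """Return the counter values emitted by each partition.

    Assumption: every partition has a private counter that begins at
    `start` and goes up by 1 per row, and rows are dealt round robin.
    """
    outputs = []
    for p in range(num_partitions):
        # Round robin: partition p receives rows p, p + N, p + 2N, ...
        rows_here = len(range(p, total_rows, num_partitions))
        outputs.append([start + i for i in range(rows_here)])
    return outputs


def collect_sorted(outputs):
    """Merge all partition outputs into one sorted list (assumed collector)."""
    return sorted(value for part in outputs for value in part)


if __name__ == "__main__":
    for start in (1, 2, 3):
        one_node = collect_sorted(simulate_per_partition_counter(4, 1, start))
        two_node = collect_sorted(simulate_per_partition_counter(4, 2, start))
        print(f"start={start}: 1 node -> {one_node}, 2 nodes -> {two_node}")
```

Under these assumptions a start value of 2 gives 2,3,4,5 on one node and 2,2,3,3 on two nodes, and a start value of 3 gives 3,4,5,6 and 3,3,4,4, which matches the pattern reported for the second and third files. The 3,3,3,3 reported for the first file is not reproduced by this model.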
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Identify (from the score) exactly what partitioning is being used at each stage in your job, and post that information here. Without it, it is not possible to provide cogent advice.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
shankar_iyer
Participant
Posts: 5
Joined: Sun Jun 25, 2006 12:31 am
Location: Melbourne, Australia

Post by shankar_iyer »

Along with the partitioning, you should also mention the constraints on each of your output links, if any.
Shankar Iyer
Business Analyst
Hewlett Packard
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

There is a thread in the FAQ forum on implementing a counter in a Transformer of a parallel job. You are safer referring to the special macros for parallel jobs @NUMPARTITIONS and @PARTITIONNUM. You set the three stage variable starting values to:
@PARTITIONNUM - @NUMPARTITIONS +1
@PARTITIONNUM - @NUMPARTITIONS +2
@PARTITIONNUM - @NUMPARTITIONS +3

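Then increment each variable by @NUMPARTITIONS (not @PARTITIONNUM, which is 0 on the first partition and would leave that partition's counter unchanged):
StageVar1 = StageVar1 + @NUMPARTITIONS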

This should give you three stage variables starting at 1, 2 and 3 and delivering unique numbers across the partitions. With round robin partitioning I think this will deliver numbers in sequence but there is a chance numbers will output out of sequence (if one partition is faster than the others) and there is also a chance that some numbers will be skipped at the end of a dataset if partitions are not balanced. You should test it to see what happens.
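
The counter scheme can be checked with a small Python simulation. This is a sketch under stated assumptions rather than DataStage code: it assumes @PARTITIONNUM is zero-based, that the per-row increment is the partition count (@NUMPARTITIONS), since adding @PARTITIONNUM would add 0 on the first partition, and that the variable is incremented before each row is written. The function names and the `offset` parameter (1, 2 or 3 for the three stage variables) are hypothetical.

```python
def partition_counter(partition_num, num_partitions, offset, row_count):
    """Values one partition emits for a stage variable.

    Initial value: partition_num - num_partitions + offset
    Per row:       value = value + num_partitions, then output value
    """
    value = partition_num - num_partitions + offset
    emitted = []
    for _ in range(row_count):
        value += num_partitions
        emitted.append(value)
    return emitted


def run_job(total_rows, num_partitions, offset):
    """Emitted values per partition, assuming round robin row distribution."""
    result = []
    for p in range(num_partitions):
        # Round robin: partition p receives rows p, p + N, p + 2N, ...
        rows_here = len(range(p, total_rows, num_partitions))
        result.append(partition_counter(p, num_partitions, offset, rows_here))
    return result


if __name__ == "__main__":
    for offset in (1, 2, 3):
        for nodes in (1, 2):
            parts = run_job(4, nodes, offset)
            merged = sorted(v for part in parts for v in part)
            print(f"offset={offset}, nodes={nodes}: {parts} -> {merged}")
```

With 4 rows balanced across partitions, the merged values are 1,2,3,4 for offset 1 on both one and two nodes. If the partitions are unbalanced, gaps appear: `partition_counter(0, 2, 1, 3)` yields 1, 3, 5 while `partition_counter(1, 2, 1, 1)` yields only 2, so 4 is skipped.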