problem with parallel job

Posted: Fri Jan 19, 2007 2:11 pm
by kirankota79
Hi,

I have an input sequential file that passes through a Transformer, without any changes, to an output sequential file.
When I do this with a server job, the output is the same as the input.
But when I do this with a parallel job, the output file is a scrambled version of the input file. I mean the output file doesn't contain the data in the same order as the input file; it is getting shuffled. The output file first contains the even columns and then the odd columns. I am not able to figure out the problem. Is it a problem with the compiler? Need help!

Posted: Fri Jan 19, 2007 2:18 pm
by velagapudi_k
Are you saying the data is shuffled or the column order has changed? If the data is shuffled, that is entirely expected, because the data is distributed among the nodes your DataStage is using and then collected back at the output, so the original row order is not guaranteed. If the column order has changed, then you have to check your mapping.

Posted: Fri Jan 19, 2007 2:20 pm
by kirankota79
The column order is not changed. Only the data is shuffled!

Posted: Fri Jan 19, 2007 2:28 pm
by I_Server_Whale
And that is expected in a parallel job. :wink:

Posted: Fri Jan 19, 2007 2:38 pm
by DSguru2B
This is because the data is partitioned and processed on different nodes in parallel. One node can finish faster than another, hence the shuffled behaviour. If you want the output to be in order, go to the 'Partitioning' tab of your output Sequential File stage and change the Collector Type to 'Sort Merge'. Then choose the key on which the input is sorted. That will make sure your data comes out in order.
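
To make the idea concrete, here is a rough Python sketch of what partitioning and collecting do to row order. It is only a toy model, not DataStage code: the function names (round_robin_partition, collect_node_by_node, collect_sort_merge) are made up for illustration, and it assumes two nodes, round-robin partitioning, and input that is already sorted on the first column.

```python
import heapq
from itertools import chain


def round_robin_partition(rows, num_nodes):
    """Deal rows out to nodes one at a time (hypothetical stand-in for a partitioner)."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions


def collect_node_by_node(partitions):
    """Read everything from node 0, then node 1, and so on; input order is lost."""
    return list(chain.from_iterable(partitions))


def collect_sort_merge(partitions, key):
    """Merge the per-node streams on a key; each stream must already be sorted on it."""
    return list(heapq.merge(*partitions, key=key))


# Input sorted on the first column, which plays the role of the sort key.
rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]

parts = round_robin_partition(rows, num_nodes=2)
shuffled = collect_node_by_node(parts)
ordered = collect_sort_merge(parts, key=lambda r: r[0])

print([r[0] for r in shuffled])  # keys come out as 1, 3, 5, 2, 4, 6
print([r[0] for r in ordered])   # keys come out as 1, 2, 3, 4, 5, 6
```

In this toy model the plain collector returns every other row first and then the rest, which looks a lot like the "even then odd" pattern described in the first post. The merge on the key restores the original order, but only because each partition is still sorted on that key.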