problem with parallel job

kirankota79 · Post by **kirankota79** » Fri Jan 19, 2007 2:11 pm

Hi,

I have an input sequential file that passes through a Transformer, without any changes, to an output sequential file.
When I do this with a server job, the output is the same as the input.
But when I do it with a parallel job, the output file is a scrambled version of the input file. I mean the output file doesn't contain the data in the same order as the input file; it is getting shuffled. The output file first contains the even columns and then the odd columns. I am not able to figure out the problem. Is it a problem with the compiler? Need help!

velagapudi_k · Post by **velagapudi_k** » Fri Jan 19, 2007 2:18 pm

Are you saying the data is shuffled, or that the column order changed? If the data is shuffled, that is entirely expected, because the data is distributed among the nodes your DataStage job is using and then collected back at the output in no guaranteed order. If the column order changed, then you have to check your mapping.
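
To make the effect concrete, here is a rough Python simulation, not DataStage code. It assumes round-robin partitioning across two nodes (the thread does not say which partitioning method the job uses) and a collector that takes each node's rows as that node finishes. The function names are made up for illustration.

```python
import random

def partition_round_robin(rows, num_nodes):
    """Deal rows out to nodes one at a time (assumed partitioning method)."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions

def collect_as_nodes_finish(partitions, seed=0):
    """Append each node's rows in whatever order the nodes happen to finish."""
    rng = random.Random(seed)
    finish_order = list(range(len(partitions)))
    rng.shuffle(finish_order)  # stands in for unpredictable node timing
    collected = []
    for node in finish_order:
        collected.extend(partitions[node])
    return collected

rows = [f"row{n}" for n in range(1, 9)]
parts = partition_round_robin(rows, 2)
collected = collect_as_nodes_finish(parts)

print(parts)      # odd-numbered rows on one node, even-numbered on the other
print(collected)  # same rows as the input, but no longer in input order
```

No rows are lost and no columns are moved; only the row order differs from the input.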

kirankota79 · Post by **kirankota79** » Fri Jan 19, 2007 2:20 pm

The column order is not changed. Only the data is shuffled!

I_Server_Whale · Post by **I_Server_Whale** » Fri Jan 19, 2007 2:28 pm

And that is expected in a parallel job.

DSguru2B · Post by **DSguru2B** » Fri Jan 19, 2007 2:38 pm

This is because the data is partitioned and processed on different nodes in parallel. One node can finish faster than another, hence the shuffled output. If you want the output to be in order, go to the 'Partitioning' tab of your output sequential file and change the Collector Type to 'Sort Merge'. Choose the key on which the input is sorted. That will make sure your data keeps its order on output.
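
As a sketch of the idea behind a sort-merge collector (plain Python, not DataStage internals): if each partition is already sorted on the key, merging them by that key restores one ordered stream. The `id` column and the sample rows are hypothetical.

```python
import heapq

def sort_merge_collect(partitions, key):
    """Merge partitions that are each already sorted on `key` into one ordered list."""
    return list(heapq.merge(*partitions, key=key))

# Two partitions, each sorted on the hypothetical key column "id".
node0 = [{"id": 1, "name": "a"}, {"id": 3, "name": "c"}, {"id": 5, "name": "e"}]
node1 = [{"id": 2, "name": "b"}, {"id": 4, "name": "d"}, {"id": 6, "name": "f"}]

merged = sort_merge_collect([node0, node1], key=lambda row: row["id"])
print([row["id"] for row in merged])
```

The merge only works if every partition is sorted on the chosen key, which is why the input needs to be sorted on that key to begin with.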