Hi,
I have an input sequential file that passes through a Transformer, without any changes, to an output sequential file.
When I run this as a server job, the output is the same as the input.
But when I run it as a parallel job, the output file is a scrambled version of the input file. The output doesn't contain the data in the same order as the input; it gets shuffled, so the output file first contains the even columns and then the odd columns. I am not able to figure out the problem. Is it a problem with the compiler? Need help!
problem with parallel job
-
- Premium Member
- Posts: 315
- Joined: Tue Oct 31, 2006 3:38 pm
-
- Premium Member
- Posts: 142
- Joined: Mon Jun 27, 2005 5:31 pm
- Location: Atlanta GA
Are you saying the rows are shuffled, or that the column order changed? If the rows are shuffled, that is entirely expected, because the data is distributed among the nodes your DataStage job is using and then collected back at the output, so the original row order is not guaranteed. If the column order changed, then you have to check your mapping.
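To illustrate why the rows can come back in a different order, here is a toy Python model of partition-then-collect. It is only a sketch, not how DataStage is implemented: the round-robin partitioning, the two-node setup, and the function names `round_robin_partition` and `collect_node_by_node` are all assumptions made for the example. In a real job the collected order also depends on which node finishes first.

```python
def round_robin_partition(rows, num_nodes):
    """Deal rows out to nodes one at a time (assumed partitioning method)."""
    partitions = [[] for _ in range(num_nodes)]
    for index, row in enumerate(rows):
        partitions[index % num_nodes].append(row)
    return partitions


def collect_node_by_node(partitions):
    """Read everything from node 0, then node 1, and so on."""
    collected = []
    for partition in partitions:
        collected.extend(partition)
    return collected


rows = ["row0", "row1", "row2", "row3", "row4", "row5"]
partitions = round_robin_partition(rows, num_nodes=2)
scrambled = collect_node_by_node(partitions)

print(partitions)  # [['row0', 'row2', 'row4'], ['row1', 'row3', 'row5']]
print(scrambled)   # ['row0', 'row2', 'row4', 'row1', 'row3', 'row5']
```

With two nodes, this toy model gives the even-numbered rows first and then the odd-numbered rows, which looks a lot like the "even then odd" pattern described in the question.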
Venkat Velagapudi
-
- Premium Member
- Posts: 1255
- Joined: Wed Feb 02, 2005 11:54 am
- Location: United States of America
This happens because the data is partitioned and processed on different nodes in parallel. One node can finish faster than another, hence the shuffled output. If you want the output in order, open your output Sequential File stage, go to the 'Partitioning' tab, and change the Collector Type to 'Sort Merge'. Then choose the key on which the input is sorted. That will make sure your data comes out in order, provided the input really is sorted on that key.
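As a rough picture of what a sort-merge style collector does, the Python sketch below merges already-sorted partitions back into one sorted stream using the standard library's `heapq.merge`. It is an analogy under stated assumptions, not DataStage's actual code: the record layout `(key, value)`, the two partitions, and the function name `sort_merge_collect` are made up for illustration.

```python
import heapq


def sort_merge_collect(partitions, key):
    """Merge partitions that are each already sorted on `key`."""
    return list(heapq.merge(*partitions, key=key))


# Input sorted on the first field; this sort order is an assumption.
records = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]

# Two hypothetical node partitions, each still sorted on the key.
node_partitions = [records[0::2], records[1::2]]

merged = sort_merge_collect(node_partitions, key=lambda record: record[0])
print(merged == records)  # True
```

The merge only restores the original order because every partition is sorted on the chosen key. If the input file has no sorted key, merging this way will not give back the input order.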
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.