Preserving sort order in DataStage

DataStage_Sterling

Post by DataStage_Sterling

Hi,
We are migrating from DataStage 7.5.1 to 8.7.

Configuration
7.5.1 - 4 nodes
8.7 - Grid 2x2 (APT_GRID_COMPUTENODES=2, APT_GRID_PARTITIONS=2)

DataStage job design (unchanged in migration)

Db -> sort -> transformer -> sequential file

The requirement is to produce sorted output in the sequential file, but the sort keys are not part of that file. They are used only to sort the records in the Sort stage and are dropped in the Transformer stage. All stages use Auto partitioning except the Sort stage, which uses hash partitioning on the keys (this is unchanged in 8.7).

Issue
The sequential file generated in 7.5.1 is completely sorted, but the one generated by the 8.7 job is out of order.
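
To illustrate the symptom, here is a minimal Python sketch (not DataStage code). It assumes the rows are hash partitioned on an integer key, sorted within each partition, and then collected without comparing keys. The function names and the modulo "hash" are hypothetical stand-ins, and whether the 8.7 job actually behaves this way is an assumption, not something confirmed in this thread.

```python
def hash_partition(rows, num_partitions):
    """Assign each (key, payload) row to a partition by key, like a hash partitioner."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in rows:
        # Integer keys with modulo keep the example deterministic.
        partitions[key % num_partitions].append((key, payload))
    return partitions


def collect_in_partition_order(partitions):
    """Hypothetical stand-in for a collector that does not compare keys."""
    return [row for partition in partitions for row in partition]


rows = [(k, f"record-{k}") for k in (7, 2, 9, 4, 1, 8, 3, 6, 5, 0)]

# Each partition is sorted on its own, as a parallel sort would do.
partitions = [sorted(p) for p in hash_partition(rows, num_partitions=2)]

collected_keys = [key for key, _ in collect_in_partition_order(partitions)]
print(collected_keys)  # [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]
```

Every partition is sorted, yet the collected output is not, because nothing compares keys across partitions at collection time.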

Question 1
I would like to know whether the way Auto partitioning/collection works changed between 7.5.1 and 8.7.

The following post describes a similar problem:
viewtopic.php?t=149797

1. Running in sequential mode is not an option, as we have 50 million records.
2. Making the sort columns available to the Sequential File stage and choosing the Sorted Merge collector is a last resort, as the sort columns would then have to be removed in a Unix script.
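
The sorted merge idea in point 2 can be sketched in Python (again, not DataStage code). It assumes each partition is already sorted on the key and that the key is still present on every row at collection time, which is why the sort columns would have to reach the Sequential File stage. `sorted_merge` and the sample rows are hypothetical.

```python
import heapq
from operator import itemgetter


def sorted_merge(partitions, key_index=0):
    """Merge already-sorted partitions into one globally sorted list.

    heapq.merge compares only the current head row of each partition,
    so it needs the key on every row but never re-sorts the data.
    """
    return list(heapq.merge(*partitions, key=itemgetter(key_index)))


partitions = [
    [(0, "record-0"), (2, "record-2"), (4, "record-4")],
    [(1, "record-1"), (3, "record-3"), (5, "record-5")],
]

merged = sorted_merge(partitions)
merged_keys = [key for key, _ in merged]
print(merged_keys)  # [0, 1, 2, 3, 4, 5]
```

The merge only works because the key travels with each row to the collector, which is the cost mentioned above: the key columns end up in the output and must be removed afterwards.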

Question 2
Is there any option to resolve this?


We found a couple of other instances where Auto partitioning worked fine in 7.5.1 but produced different data in 8.7. We had to specify explicit partitioning, and then it worked.


Thank you in advance for your opinions and expert suggestions.

DataStage_Sterling

Re: Preserving sort order in Datastage

Post by DataStage_Sterling

Thank you. I used APT_NO_SORT_INSERTION and APT_NO_SORT_OPTIMIZATION, but the job still did not work as expected. In the interest of time, I had to fall back on the last resort: make the sort keys available in the sequential file and remove them afterwards in the script. It worked fine.
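
The actual script is not shown in this thread. As a rough sketch of the key-removal step, the Python below assumes a delimited text file in which the sort keys are the leading fields and the delimiter never appears inside a key. The `|` delimiter, the number of key fields, and the function name are all assumptions.

```python
def strip_leading_fields(lines, num_key_fields, delimiter="|"):
    """Yield each line with its first num_key_fields fields removed.

    Lines are processed one at a time, so memory use stays flat
    however many records the file holds.
    """
    for line in lines:
        # Split off only the key fields; the remainder stays untouched.
        parts = line.rstrip("\n").split(delimiter, num_key_fields)
        if len(parts) <= num_key_fields:
            raise ValueError(f"expected more than {num_key_fields} fields: {line!r}")
        yield parts[-1] + "\n"


sample = ["1|A|first|row\n", "2|B|second|row\n"]
stripped = list(strip_leading_fields(sample, num_key_fields=2))
print(stripped)  # ['first|row\n', 'second|row\n']
```

For a real file, the same generator can be fed an open file handle and its output passed to `writelines` on the target file, so the 50 million records are never held in memory at once.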

Thanks
DataStage Sterling