We have just migrated from v6 to v7 of Parallel Extender DataStage. Most of the jobs that we migrated have been working fine.
However, I have noticed that the behaviour of Merge and Join appears to have changed, in that it is no longer necessary to sort prior to the Merge or Join stage.
I have completed a number of simple tests which suggest that including sorts before Merge stages actually lengthens the time it takes for the job to run.
DataStage (Orchestrate?) also complains with warnings such as:
APT_ParallelSortMergeOperator(0),0: WARNING: ParallelSortMerge is combined with its input.
APT_ParallelSortMergeOperator(0),0: WARNING: Partitioning for combined operators is straight-through, so ParallelSortMerge will do nothing.
Although I have never seen that error message before, your PX scripts might be suffering from the automatic insertion of hashes and sorts that happens in PX 6+. Check on the PX Forum for a discussion about this feature.
To add to your discussion: I used Orchestrate (before Ascential took over Torrent), and we always used a Hash-Sort before any Joins. It is also good to do when partitioning, because you can then expect all records with the same key to land on one partition, so that when you re-run your job you get the same result.
We had situations back then with Orchestrate where we got different results on different runs.
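
To show why the partitioning matters, here is a rough sketch in plain Python (not DataStage or Orchestrate code; all function names and the sample data are made up for illustration). It assumes a join that only compares records sitting in the same partition, and uses a simple per-partition lookup instead of a real sort-merge. With hash partitioning on the key, matching records always meet; with round-robin they may not.

```python
from collections import defaultdict
import zlib

def hash_partition(rows, key, n_parts):
    """Send every row with the same key value to the same partition."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        # crc32 is stable between runs, so the layout is repeatable
        idx = zlib.crc32(str(row[key]).encode()) % n_parts
        parts[idx].append(row)
    return parts

def round_robin_partition(rows, n_parts):
    """Deal rows out in arrival order, ignoring the key."""
    parts = [[] for _ in range(n_parts)]
    for i, row in enumerate(rows):
        parts[i % n_parts].append(row)
    return parts

def join_partitionwise(left_parts, right_parts, key):
    """Inner join; each left partition only sees the matching right partition."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        lookup = defaultdict(list)
        for r in rp:
            lookup[r[key]].append(r)
        # sort within the partition so output order is repeatable
        for l in sorted(lp, key=lambda x: x[key]):
            for r in lookup[l[key]]:
                out.append({**l, **r})
    return out

# Hypothetical sample data: same keys on both sides, different arrival order
left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"},
        {"id": 3, "name": "c"}, {"id": 4, "name": "d"}]
right = [{"id": 4, "amt": 40}, {"id": 3, "amt": 30},
         {"id": 2, "amt": 20}, {"id": 1, "amt": 10}]

hashed = join_partitionwise(hash_partition(left, "id", 2),
                            hash_partition(right, "id", 2), "id")
rr = join_partitionwise(round_robin_partition(left, 2),
                        round_robin_partition(right, 2), "id")

print(len(hashed))  # all four keys find their match
print(len(rr))      # matches are lost because keys were split across partitions
```

In this toy case the round-robin result depends entirely on the order the rows arrive in, which is the kind of run-to-run difference described above.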