Merge and Join and Sort

adscot
Participant
Posts: 12
Joined: Mon Oct 13, 2003 3:04 am
Location: London

Post by adscot »

Hello,

We have just migrated from v6 to v7 of Parallel Extender DataStage. Most of the jobs that we migrated have been working fine, which is good.

However, I have noticed that the behaviour of Merge and Join appears to have changed, in that it is no longer necessary to sort prior to the Merge or Join stage.

I have completed a number of simple tests which imply that including sorts before Merge stages actually lengthens the time it takes for the job to run.

DataStage (Orchestrate?) also complains with warning messages such as:

        APT_ParallelSortMergeOperator(0),0: WARNING: ParallelSortMerge is combined with its input.
        APT_ParallelSortMergeOperator(0),0: WARNING: Partitioning for combined operators is straight-through, so ParallelSortMerge will do nothing.

Cheers,

Adrian
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Would you believe "it's a feature, not a fault" ?!!

I did read something about this recently (the sort being performed automatically), now where was that?...
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Post by bigpoppa »

Although I have never seen that message before, your PX scripts might be affected by the automatic insertion of hash partitioners and sorts that happens in PX 6+. Check the PX Forum for a discussion of this feature.

-BP
adscot
Participant
Posts: 12
Joined: Mon Oct 13, 2003 3:04 am
Location: London

Post by adscot »

ray.wurlod wrote: Would you believe "it's a feature, not a fault" ?!!
I'd agree with that :D - still it would be nice if the documentation told us about these features...
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

adscot wrote:
ray.wurlod wrote: Would you believe "it's a feature, not a fault" ?!!
I'd agree with that :D - still it would be nice if the documentation told us about these features...
Yeah sure! Documentation coming right up. It's actually scheduled for Version 7.5! :)

Seriously. :)

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
vdreddy
Participant
Posts: 5
Joined: Fri Oct 10, 2003 11:32 am

Post by vdreddy »

To add to your discussion: I used Orchestrate before Ascential took over Torrent, and back then we used a hash partition and sort before any Join. It is also good practice when partitioning, because you can expect rows with the same key to land on the same partition, so that when you re-run your job you get the same result.
We had situations with Orchestrate then where different runs gave different results.
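
To make the hash-and-sort idea concrete, here is a minimal Python sketch. It is not DataStage or Orchestrate code. All of the names (`hash_partition`, `merge_join`, `partitioned_join`) and the sample rows are made up for illustration. It only shows why hashing both inputs on the join key and then sorting each partition gives a join that comes out the same on every run.

```python
from zlib import crc32


def hash_partition(rows, key, num_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is stable between runs (Python's built-in hash() of a str is not).
        index = crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions


def merge_join(left, right, key):
    """Inner join of two lists that are ALREADY sorted on key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right-hand rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left-hand row of this key with that run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    out.append({**r, **left[i]})
                i += 1
            j = j_end
    return out


def partitioned_join(left_rows, right_rows, key, num_partitions=4):
    """Hash-partition both inputs, sort each partition, then join partition by partition."""
    left_parts = hash_partition(left_rows, key, num_partitions)
    right_parts = hash_partition(right_rows, key, num_partitions)
    result = []
    for lp, rp in zip(left_parts, right_parts):
        lp_sorted = sorted(lp, key=lambda r: r[key])
        rp_sorted = sorted(rp, key=lambda r: r[key])
        result.extend(merge_join(lp_sorted, rp_sorted, key))
    return result


# Hypothetical sample data.
customers = [{"id": 3, "name": "Cy"}, {"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 2, "total": 20}, {"id": 1, "total": 10},
          {"id": 1, "total": 15}, {"id": 4, "total": 99}]

joined = partitioned_join(customers, orders, "id")
```

Because both inputs are partitioned with the same hash on the same key, matching rows always end up in the same partition, and the per-partition sort lets the merge join walk each side once. If the two sides were partitioned differently (round robin, say), matching keys could sit in different partitions and the set of joined rows would depend on how the data happened to be distributed.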