Sort stage and Partition sort



arunkumar1111
Participant
Posts: 19
Joined: Sun Jun 29, 2008 10:19 pm


Post by arunkumar1111 »

Hi All

What is the difference between sorting with a Sort stage and the sorting done on the input link of stages such as Remove Duplicates, Join, etc.?

Why is there a sort option on those stages when we have an explicit Sort stage?

Which is more effective?
Thanks and Regards
Arun
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

In all cases the same tsort operator is used.

Input link sorting constrains you to all default settings. If that's all you need, it's a way to avoid cluttering up your design.

The Sort stage, however, gives additional choices, such as the ability to generate key change columns, to perform sub-sorts only, and to control how much memory is allocated for sorting. My favourite is the ability to specify "don't sort" in the Sort stage, which seems counter-intuitive but prevents DataStage from inserting tsort operators where sorted input is required.
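To make two of those options concrete, here is a rough conceptual sketch in Python. It is not DataStage or tsort code. It only illustrates what a key change column and a sub-sort mean for the data. The function names, the sample column names and the "keyChange" output column name are all assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def sort_with_key_change(rows, key):
    """Sort rows on `key` and add a flag marking the first row of each key group.

    Conceptual stand-in for a key change column: 1 where the key value
    differs from the previous row, 0 otherwise.
    """
    ordered = sorted(rows, key=itemgetter(key))  # stable sort on the key
    result = []
    previous = object()  # sentinel that never equals a real key value
    for row in ordered:
        flagged = dict(row)
        # "keyChange" is a hypothetical column name used for this sketch
        flagged["keyChange"] = 1 if row[key] != previous else 0
        result.append(flagged)
        previous = row[key]
    return result


def sub_sort(rows, sorted_key, sub_key):
    """Sort only within groups of `sorted_key`.

    Assumes the input is already sorted (or at least grouped) on
    `sorted_key`, so no full re-sort of the data is needed.
    """
    result = []
    for _, group in groupby(rows, key=itemgetter(sorted_key)):
        result.extend(sorted(group, key=itemgetter(sub_key)))
    return result


if __name__ == "__main__":
    rows = [
        {"cust": "B", "amt": 5},
        {"cust": "A", "amt": 3},
        {"cust": "A", "amt": 1},
    ]
    flagged = sort_with_key_change(rows, "cust")
    for row in sub_sort(flagged, "cust", "amt"):
        print(row)
```

The second function is the reason a sub-sort can be cheaper: when the data already arrives ordered on the leading key, only the rows inside each group have to be rearranged.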
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.