Need help with sorting the data

zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

I am joining two datasets using the Join stage. To sort the incoming data, which of the two is better:

1) In-link sort
2) Two explicit Sort stages, one on each link

It would be a great help if you could explain the reason.

Thanks.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Both are the same as far as sorting alone (using the DataStage sort) is concerned, as both will insert a tsort operator.

But an explicit Sort stage gives you more options than an in-link sort: you can generate a key change column, choose the utility used for sorting (DataStage or UNIX), dump statistics, and restrict memory usage from the stage itself.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

My concern is only sorting the data; no other options are required.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

zulfi123786 wrote: My concern is only sorting the data; no other options are required.
Then I don't think it makes any difference during execution, but I prefer an explicit Sort stage.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
srinivas.g
Participant
Posts: 251
Joined: Mon Jun 09, 2008 5:52 am

Post by srinivas.g »

Performance-wise, an inline sort is better than explicit Sort stages.
Srinu Gadipudi
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

srinivas.g wrote: Performance-wise, an inline sort is better than explicit Sort stages.
Based on what? Under the covers they're both the same tsort operator.
-craig

"You can never have too many knives" -- Logan Nine Fingers
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

Could you please mention which DataStage document discusses tsort?

I did not find anything in the DataStage Parallel Job Developer Guide saying that the Sort stage inserts a tsort operator.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You'd probably have to go back to an ORCHESTRATE manual for that. Check the Generated OSH tab in the job, you'll see them there.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

srinivas.g wrote: Performance-wise, an inline sort is better than explicit Sort stages.
I disagree 100%. But I'd be interested to hear your reasons.

Two reasons an explicit Sort stage is better (and can be better for performance):
  • you can control the amount of memory allocated
  • you can generate key change indicators
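
For readers who have not used a key change indicator, here is a rough sketch in Python of what the flag means. This is not DataStage code and not how tsort is implemented; the function name `add_key_change`, the column name `keyChange`, and the sample rows are made up for illustration. The idea is simply that, after sorting on the key, the first row of each key group is flagged 1 and the remaining rows in that group are flagged 0.

```python
def add_key_change(rows, key):
    """Sort rows on `key` and flag the first row of each key group.

    Illustrative only: mimics the idea of a key change column,
    not the actual behaviour of the DataStage Sort stage.
    """
    ordered = sorted(rows, key=lambda r: r[key])  # stable sort on the key
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    for row in ordered:
        current = row[key]
        # 1 when the key differs from the previous row's key, else 0
        flagged.append({**row, "keyChange": 1 if current != previous else 0})
        previous = current
    return flagged


# Hypothetical sample data
rows = [
    {"cust_id": 2, "amount": 10},
    {"cust_id": 1, "amount": 5},
    {"cust_id": 2, "amount": 7},
    {"cust_id": 1, "amount": 3},
]

flagged = add_key_change(rows, "cust_id")
for row in flagged:
    print(row)
```

A downstream step can then detect group boundaries by testing the flag instead of comparing each row's key with the previous one.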
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.