Need help with sorting the data

Posted: Wed Dec 23, 2009 3:54 am
by zulfi123786
I am joining two datasets using the Join stage. Which of these two ways of sorting the incoming data is better?

1) Link sort on the Join stage's input links
2) Two explicit Sort stages, one on each link

It would be a great help if you could explain the reason.

Thanks.

Posted: Wed Dec 23, 2009 4:17 am
by priyadarshikunal
As far as sorting alone is concerned (using the DataStage sort), both are the same, since both insert a tsort operator.

But an explicit Sort stage gives you more options than a link sort: you can generate a key change column, choose the sort utility (DataStage or UNIX), dump statistics, and restrict memory usage from the stage itself.

Posted: Wed Dec 23, 2009 5:54 am
by zulfi123786
My only concern is sorting the data; no other options are required.

Posted: Wed Dec 23, 2009 7:25 am
by priyadarshikunal
zulfi123786 wrote:My concern is only sorting the data, no other options required
Then I don't think it makes any difference during execution, but I prefer an explicit Sort stage.

Posted: Wed Dec 23, 2009 7:50 am
by srinivas.g
Performance-wise, inline sort is better than explicit Sort stages.

Posted: Wed Dec 23, 2009 8:09 am
by chulett
srinivas.g wrote:Performance wise Inline sort is best compare to explicit sort stages.
Based on what? Under the covers they're both the same tsort operator.

Posted: Wed Dec 23, 2009 10:31 am
by zulfi123786
Could you please tell me which DataStage document discusses tsort?

I did not find anything in the DS Parallel Job Developer Guide saying that the Sort stage inserts a tsort operator.

Posted: Wed Dec 23, 2009 10:59 am
by chulett
You'd probably have to go back to an ORCHESTRATE manual for that. Check the Generated OSH tab in the job, you'll see them there.

Posted: Wed Dec 23, 2009 4:21 pm
by ray.wurlod
srinivas.g wrote:Performance wise Inline sort is best compare to explicit sort stages.
I disagree 100%. But I'd be interested to hear your reasons.

Two reasons an explicit Sort stage is better (and can be better for performance):
  • you can control the amount of memory allocated
  • you can generate key change indicators
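
For anyone unsure what a key change indicator is, here is a rough sketch in plain Python (not DataStage or OSH code). It assumes the usual behaviour: the flag is 1 on the first row of each sorted key group and 0 on the rows that follow. The function name `add_key_change`, the output column name `keyChange`, and the sample rows are all made up for illustration.

```python
from operator import itemgetter


def add_key_change(rows, key):
    """Sort rows by `key` and flag the first row of each key group with 1."""
    ordered = sorted(rows, key=itemgetter(key))  # stable sort on the key column
    previous = object()  # sentinel that never equals a real key value
    flagged = []
    for row in ordered:
        # 1 when the key differs from the previous row's key, otherwise 0
        flag = 1 if row[key] != previous else 0
        flagged.append({**row, "keyChange": flag})
        previous = row[key]
    return flagged


# Hypothetical sample data
rows = [
    {"cust": "B", "amt": 5},
    {"cust": "A", "amt": 1},
    {"cust": "B", "amt": 7},
    {"cust": "A", "amt": 3},
]
flagged = add_key_change(rows, "cust")
for row in flagged:
    print(row)
```

A downstream step can then detect group boundaries by testing the flag instead of comparing each row with the previous one.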