Sort Merge

Posted: Thu Aug 09, 2007 9:57 am
by gbusson
Hi all,

I have a question about the sort merge library in datastage EE (PX).

At first I thought it worked correctly only when the partitions are already sorted on the key(s).

I've been told on dsxchange that the "pre-sort" was unnecessary.

I ran a test, and the result shows that the sort order is totally different between the 2 methods (pre-sort + sort merge vs. sort merge only).

Previous tests showed me that pre-sort + sort merge gives correct results.

So, who's right?

Thanks for your help.

Posted: Thu Aug 09, 2007 10:04 am
by michaeld
If you do not insert a sort operator then datastage will insert one for you behind the scenes. It will sort on the keys that are being merged.

It is best to insert the sort operator yourself, because that way you know exactly what it is sorting on and you have control over other aspects of the sort, such as memory usage and not re-sorting columns that are already sorted.

The mixed result may also be due to datastage repartitioning your data. Set the partitioning to SAME to make sure that datastage does not repartition and check the score in the log for more details on what it is doing behind the scenes.

Posted: Fri Aug 10, 2007 1:31 am
by gbusson
Hi, I forgot to mention one thing:

I set nosortinsertion nopartinsertion!
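
To illustrate why the two methods can give different orders once automatic sort insertion is switched off, here is a minimal sketch of a generic k-way merge in Python. It is only an analogy, not DataStage's actual implementation: the function name `sort_merge` and the sample key values are hypothetical, and the sketch assumes that a sort merge simply picks the smallest head record among the partitions at each step.

```python
import heapq


def sort_merge(partitions):
    """Generic k-way merge; it assumes each partition is already sorted on the key."""
    return list(heapq.merge(*partitions))


# Hypothetical key values spread over two partitions (not taken from the thread).
unsorted_partitions = [[5, 1, 9], [4, 8, 2]]

# "Pre-sort": sort each partition on the key before merging.
presorted_partitions = [sorted(p) for p in unsorted_partitions]

merge_only = sort_merge(unsorted_partitions)
presort_then_merge = sort_merge(presorted_partitions)

print(merge_only)          # [4, 5, 1, 8, 2, 9] -> not globally sorted
print(presort_then_merge)  # [1, 2, 4, 5, 8, 9] -> globally sorted
```

In this sketch the merge never reorders records inside a partition, so it only produces a sorted result when every input partition is sorted beforehand. If nothing sorts the partitions (neither an explicit sort nor an automatically inserted one), the same records come out in a different order.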