Regarding sorting data before joining

Posted: Tue Mar 07, 2006 12:22 am
by ThilSe
Hi,

I have a question.

Is there any difference between using a SORT Stage and PERFORM SORT in the partitioning tab while joining the data, other than:
-> the ability to set the Already sorted option
-> use of Unix sort
-> visual appearance?

Or are there any other benefits?

Thanks/Regards
Senthil

Posted: Tue Mar 07, 2006 12:43 am
by ameyvaidya
2 more:
1 For large data set sizes (>20 MB) the Sort Stage is better.

2 I don't believe the on-link sort can do a sequential sort.

Posted: Tue Mar 07, 2006 7:53 am
by ray.wurlod
What do you mean by "sequential sort"?

Posted: Tue Mar 07, 2006 8:40 am
by kumar_s
For better performance, a dedicated Sort stage can always be chosen; it has its own scratch disk space.
Unix sort makes use of the Unix-level sort utility. It may be more efficient for data with a small number of records.
If the incoming data is already sorted, you can enable the Already sorted option to get better performance by avoiding re-sorting.
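
To make the point about sorting before a join concrete, here is a rough sketch in plain Python, not DataStage code. It assumes rows are (key, value) tuples and that the join is a simple inner merge join; the names `is_sorted`, `sort_if_needed`, and `merge_join` are made up for illustration. The sketch checks the order itself before deciding to skip the sort, which is only an illustration and says nothing about how the stage's option behaves internally.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (join key, payload); a hypothetical row layout


def is_sorted(rows: List[Row]) -> bool:
    """Return True if rows are in non-decreasing key order."""
    return all(rows[i][0] <= rows[i + 1][0] for i in range(len(rows) - 1))


def sort_if_needed(rows: List[Row]) -> List[Row]:
    """Skip the sort when the input is already in key order."""
    return rows if is_sorted(rows) else sorted(rows, key=lambda r: r[0])


def merge_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Inner join of two inputs that are both sorted on the key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][0] == lk:
                for k in range(j, j_end):
                    out.append((lk, left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out


left = sort_if_needed([(3, "c"), (1, "a"), (2, "b")])   # needs sorting
right = sort_if_needed([(2, "x"), (3, "y"), (3, "z")])  # already sorted, sort skipped
joined = merge_join(left, right)
```

The join only walks each input once because both sides arrive in key order, which is why the sort (or the guarantee that the data is already sorted) has to come first.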

Posted: Wed Mar 08, 2006 12:53 am
by ameyvaidya
ray.wurlod wrote: What do you mean by "sequential sort"?
What I meant was that the Sort stage can work in both sequential mode and parallel mode, while the on-link sort, as it has to work on partitioned data, can't.

I don't recollect whether on-link sorting is available on the input of stages running in sequential mode.
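
The difference between a sequential sort and a sort on partitioned data can be sketched roughly like this, again in plain Python and not DataStage code. It assumes integer keys and a simple modulo hash for partitioning; `hash_partition`, `parallel_sort`, and `sequential_sort` are hypothetical names.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (key, payload); a hypothetical row layout


def hash_partition(rows: List[Row], n: int) -> List[List[Row]]:
    """Send each row to partition key % n, so equal keys land together."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for row in rows:
        parts[row[0] % n].append(row)
    return parts


def parallel_sort(rows: List[Row], n: int) -> List[List[Row]]:
    """Sort each partition on its own; no global order across partitions."""
    return [sorted(p, key=lambda r: r[0]) for p in hash_partition(rows, n)]


def sequential_sort(rows: List[Row]) -> List[Row]:
    """Sort the whole data set as a single stream."""
    return sorted(rows, key=lambda r: r[0])


rows = [(5, "e"), (2, "b"), (4, "d"), (1, "a"), (3, "c")]
by_partition = parallel_sort(rows, 2)
total_order = sequential_sort(rows)
```

Each partition comes out sorted, but the partitions concatenated are not in global order. For a join that is usually enough, provided both inputs are partitioned on the same key; a total order needs the sequential sort.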