Sorting with Hash partition

Posted: Thu Jun 29, 2006 3:41 pm
by koolnitz
Hi,

I have a question about the Join stage with Hash partitioning.

Is there any scenario in which sorting should not be done when joining two streams with Hash partitioning? Or should I blindly check the Sort option whenever the partition type is set to Hash?

Thanks!

Posted: Thu Jun 29, 2006 3:58 pm
by ray.wurlod
Read the chapter in the Parallel Job Developer's Guide on the Join stage. It requires its inputs to be sorted (on join key columns). The partitioning algorithm is irrelevant to this requirement.
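To illustrate why partitioning and sorting are separate concerns, here is a minimal Python sketch (not DataStage code). The function names, the integer `id` key, and the use of modulo as a stand-in hash are all assumptions made for the example. It shows that hash partitioning puts equal keys in the same partition but leaves the rows within each partition in arrival order.

```python
def hash_partition(rows, key, n_partitions):
    """Route each row to a partition chosen from its key value."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        # Modulo stands in for a real hash function (assumption for this sketch);
        # rows with equal keys always land in the same partition.
        partitions[row[key] % n_partitions].append(row)
    return partitions


def is_sorted(rows, key):
    """Return True if rows are in ascending order on `key`."""
    return all(a[key] <= b[key] for a, b in zip(rows, rows[1:]))


rows = [{"id": 8}, {"id": 3}, {"id": 4}, {"id": 1}, {"id": 2}, {"id": 3}]
parts = hash_partition(rows, "id", 2)

for n, part in enumerate(parts):
    print(n, [r["id"] for r in part], "sorted:", is_sorted(part, "id"))
```

Both rows with `id` 3 end up in the same partition, yet neither partition is in key order, so a sort on the join key is still needed afterwards.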

The only time you would elect not to use Sort on the input link is where you are totally confident that the data on the input links are already correctly sorted.
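As a rough illustration of what goes wrong when that confidence is misplaced, the following Python sketch implements a generic sort-merge inner join. It is an assumption-laden stand-in for how a merge-style join behaves, not the Join stage's actual implementation, and `merge_join` and the sample rows are hypothetical. Fed an unsorted input, it silently drops matches instead of failing.

```python
def merge_join(left, right, key):
    """Inner join of two row lists that are BOTH sorted ascending on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key too small: advance left
        elif lk > rk:
            j += 1          # right key too small: advance right
        else:
            # Pair this left row with every right row sharing the key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                out.append({**left[i], **right[j_end]})
                j_end += 1
            i += 1          # keep j so a duplicate left key re-scans the group
    return out


left = [{"id": 3, "a": "x"}, {"id": 1, "a": "y"}, {"id": 2, "a": "z"}]   # unsorted
right = [{"id": 1, "b": "p"}, {"id": 2, "b": "q"}, {"id": 3, "b": "r"}]  # sorted

unsorted_result = merge_join(left, right, "id")
sorted_result = merge_join(sorted(left, key=lambda r: r["id"]), right, "id")

print(len(unsorted_result), len(sorted_result))
```

With the unsorted left input only the `id` 3 row is matched; after sorting, all three rows join. This is why skipping the sort is only safe when the upstream order on the join key is guaranteed.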