Auto Partition method in Sort Stage

NigeGriff · Post by **NigeGriff** » Fri Apr 24, 2009 9:38 am

When using the auto partition method in a sort stage, if i set the partition key to be column 1 but sort on columns 1 and column 2, the incoming data to the sort stage is partitioned on columns 1 and 2 rather than column 1 only.

My understanding was that auto would partition data based on the partition key specified on the stage input tab.

ray.wurlod · Post by **ray.wurlod** » Fri Apr 24, 2009 3:22 pm

If partitioning is (Auto) you don't get to set the partitioning key at all.

(Auto) on inputs to a Join stage will use all the sort keys as the partitioning key and Hash as the partitioning algorithm.

sureshreddy2009 · Post by **sureshreddy2009** » Sat Apr 25, 2009 2:01 am

hi
auto is a default partiioning method
when we select auto it dont allow to select key for sort at link level or unique etc, conclusion is what ever we had given in keys place in sort stage then it sorts based on that key or keys, here one doubt when we select column1 as sort key and in partitioning place if we select partition other than auto like hash, modulus etc, and partition keys are column1, column2 then it performs sort on column1 or it aborts

NigeGriff · Post by **NigeGriff** » Mon Apr 27, 2009 2:32 am

Thanks for the quick reply Ray.

In summary i think what this means is that using auto to decide the partitioning keys and method for a sort stage is OK except for the situation were the next stage after the sort is one such as a drop duplicates that needs to drop records based on column1.

ray.wurlod · Post by **ray.wurlod** » Mon Apr 27, 2009 2:40 am

You could always add a second Sort stage sorting just on column1 in mode "don't sort, already sorted" and have that Sort stage generate a Cluster Key Change column.

NigeGriff · Post by **NigeGriff** » Mon Apr 27, 2009 4:41 am

Thanks for the quick reply Ray.

In summary i think what this means is that using auto to decide the partitioning keys and method for a sort stage is OK except for the situation were the next stage after the sort is one such as a drop duplicates that needs to drop records based on column1.

DSXchange

Auto Partition method in Sort Stage

Auto Partition method in Sort Stage

Auto is default partiiton method