Hash Partition and Perform sort on the link partition tab

Marley777 · Post by **Marley777** » Tue Jan 04, 2011 3:23 pm

Hi, I'm wondering if someone can help me understand when to use sorting along with partition type = hash.

For the questions below, assume that I want to keep rows with the same key column values together on the same processing node, and that APT_NO_SORT_INSERTION = FALSE and APT_NO_PART_INSERTION = FALSE.

1 - If I'm in the Sort stage and have already selected the fields I wish to sort by, do I need to sort again by selecting 'perform sort' on the partitioning tab? I ask because when I select partition type = 'hash' on the partitioning tab, I get the option to 'perform sort'.

2 - If I didn't need to sort the data, but wanted all rows with the same keys on the same processing node, could I set partition type 'hash' on the partitioning tab and select the field needed for hash partitioning, without selecting 'perform sort' and the usage 'sorting partitioning'?

3 - What is 'usage = sorting partitioning'?

Any help would be greatly appreciated.

jwiles · Post by **jwiles** » Tue Jan 04, 2011 4:49 pm

Marley777 wrote:
> Hi, I'm wondering if someone can help me understand when to use sorting along with partition type = hash.
>
> For the questions below, assume that I want to keep rows with the same key column values together on the same processing node, and that APT_NO_SORT_INSERTION = FALSE and APT_NO_PART_INSERTION = FALSE.
>
> 1 - If I'm in the Sort stage and have already selected the fields I wish to sort by, do I need to sort again by selecting 'perform sort' on the partitioning tab? I ask because when I select partition type = 'hash' on the partitioning tab, I get the option to 'perform sort'.

No. Typically you don't want to select the sort option on the input link to the Sort stage, since the stage itself performs the sort. You should, however, select the appropriate partitioning options on that link if the input data has not already been partitioned.

> 2 - If I didn't need to sort the data, but wanted all rows with the same keys on the same processing node, could I set partition type 'hash' on the partitioning tab and select the field needed for hash partitioning, without selecting 'perform sort' and the usage 'sorting partitioning'?

Yes, but the usage would be partitioning only (you're not sorting!).
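As a rough illustration of what "partitioning only" means, here is a minimal Python sketch. It is not DataStage code: the function name `hash_partition`, the sample rows, and the use of `zlib.crc32` as the hash are all assumptions made for the example, and the engine's real hash function is different. The point is only that every row with the same key value lands in the same partition, while the rows inside each partition stay in arrival order.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Group rows by a hash of the key column (hypothetical helper, not a DataStage API)."""
    partitions = defaultdict(list)
    for row in rows:
        # crc32 is a stand-in for whatever hash the engine really uses.
        index = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return dict(partitions)


# Made-up sample data for the illustration.
rows = [
    {"cust": "B", "amt": 30},
    {"cust": "A", "amt": 10},
    {"cust": "B", "amt": 5},
    {"cust": "C", "amt": 7},
    {"cust": "A", "amt": 2},
]

parts = hash_partition(rows, "cust", 2)
for index, members in sorted(parts.items()):
    # Rows sharing a "cust" value always appear under the same index,
    # but nothing here has sorted them.
    print(index, members)
```

No sort is involved at any point, which matches a key whose usage is partitioning only.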

> 3 - What is 'usage = sorting partitioning'?

When you've chosen the sort option on the input tab, the usage setting lets you choose how a particular key column on the input link is used: for sorting, for partitioning, or for both.
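To make the three usage choices concrete, here is a small Python sketch under stated assumptions. The `KeyColumn` type, the `partition_and_sort` function, and the sample data are hypothetical and only mimic the idea; they are not how DataStage implements it, and `zlib.crc32` again stands in for the real hash. In the example, `cust` is used for both sorting and partitioning, while `date` is used for sorting only.

```python
import zlib
from typing import NamedTuple


class KeyColumn(NamedTuple):
    """Hypothetical model of one key column and its usage flags."""
    name: str
    sorting: bool
    partitioning: bool


def partition_and_sort(rows, keys, num_partitions):
    """Hash on the partitioning columns, then sort each partition on the sorting columns."""
    part_cols = [k.name for k in keys if k.partitioning]
    sort_cols = [k.name for k in keys if k.sorting]
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        raw = "|".join(str(row[c]) for c in part_cols).encode("utf-8")
        partitions[zlib.crc32(raw) % num_partitions].append(row)
    for part in partitions:
        # The sort is per partition; it does not produce a global ordering.
        part.sort(key=lambda r: tuple(r[c] for c in sort_cols))
    return partitions


keys = [
    KeyColumn("cust", sorting=True, partitioning=True),   # usage: both
    KeyColumn("date", sorting=True, partitioning=False),  # usage: sorting only
]

# Made-up sample data for the illustration.
rows = [
    {"cust": "B", "date": "2011-01-05"},
    {"cust": "A", "date": "2011-01-04"},
    {"cust": "B", "date": "2011-01-03"},
    {"cust": "A", "date": "2011-01-02"},
]

result = partition_and_sort(rows, keys, 2)
for part in result:
    print(part)
```

Because `date` takes no part in the hash, all rows for a given `cust` still share a partition, and within that partition they come out ordered by `cust` and then `date`.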

Hope this helps!

Marley777 · Post by **Marley777** » Wed Jan 05, 2011 8:10 am

Thanks for reading.

1 - So what will happen if, on the partitioning tab of the Sort stage, I select only hash partitioning and the keys, but do not select 'perform sort' and set the 'usage'? Is usage only available when I select 'perform sort' on the partition tab?

2 - Why would the Sort stage have the 'perform sort' option on the partition tab?

jwiles · Post by **jwiles** » Wed Jan 05, 2011 12:21 pm

Marley777 wrote:
> Thanks for reading.
>
> 1 - So what will happen if, on the partitioning tab of the Sort stage, I select only hash partitioning and the keys, but do not select 'perform sort' and set the 'usage'? Is usage only available when I select 'perform sort' on the partition tab?

Your data will be partitioned and then sorted according to the options selected on the Stage->Properties tab.

The usage column applies only to the partitioning/sorting options on the input link and has nothing to do with the Stage->Properties tab. It appears only when you select the sort option in conjunction with Hash, Modulus or Range partitioning.

> 2 - Why would the Sort stage have the 'perform sort' option on the partition tab?

That is part of the common stage interface within the Designer client, similar to Microsoft's Common Controls library.

Marley777 · Post by **Marley777** » Wed Jan 05, 2011 12:28 pm

jwiles, thanks for your help.

DSXchange
