Automatic Partitioning Not Working

Posted: Mon Mar 10, 2008 1:19 pm
by ds2000
I'm joining 6 datasets using the Join stage (left join). The 3rd dataset is not joining properly. The join works fine if I use Hash partitioning on the 3rd dataset and the mainstream data.

But then do I have to switch the partitioning back from Hash to Automatic so that the output can join properly to the remaining datasets? Please suggest.

Posted: Mon Mar 10, 2008 3:08 pm
by santhu
As a prerequisite to the JOIN stage, the dataset data should be hash partitioned and sorted on the JOIN key columns before it is fed to the JOIN stage.

If the data in the dataset is hash partitioned and sorted on JOIN keys in the previous job, then you can retain the same partitioning by setting Partitioning to "SAME" instead of "AUTO".
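To illustrate why the partitioning matters, here is a minimal Python sketch (not DataStage code). It assumes a parallel join only compares rows that sit in the same partition number on both inputs; the function names and the sample `id`/`val`/`desc` columns are made up for the illustration.

```python
def hash_partition(rows, key, n):
    """Send each row to partition hash(key) % n, so equal keys always share a partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts


def round_robin_partition(rows, n):
    """Deal rows out in arrival order, ignoring the key entirely."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def left_join_per_partition(left_parts, right_parts, key):
    """Left join each left partition against the right partition with the same index only."""
    out = []
    for lpart, rpart in zip(left_parts, right_parts):
        lookup = {}
        for r in rpart:
            lookup.setdefault(r[key], []).append(r)
        for l in lpart:
            matches = lookup.get(l[key])
            if matches:
                out.extend({**l, **m} for m in matches)
            else:
                # No partner in this partition: keep the left row, null the right column.
                out.append({**l, "desc": None})
    return out


# Hypothetical sample data; the right input arrives in a different order than the left.
left = [{"id": i, "val": i * 10} for i in (1, 2, 3, 4, 5, 6)]
right = [{"id": i, "desc": f"d{i}"} for i in (1, 2, 3, 5, 4, 6)]

hashed = left_join_per_partition(
    hash_partition(left, "id", 2), hash_partition(right, "id", 2), "id"
)
round_robin = left_join_per_partition(
    round_robin_partition(left, 2), round_robin_partition(right, 2), "id"
)

print(sum(r["desc"] is None for r in hashed))
print(sum(r["desc"] is None for r in round_robin))
```

With hash partitioning on the key, every left row finds its partner. With key-agnostic partitioning, some matching rows land in different partitions and the left join silently returns nulls for them, which looks like a dataset "not joining properly".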

Hope this helps.

Posted: Mon Mar 10, 2008 4:08 pm
by ray.wurlod
Dump the score (set the APT_DUMP_SCORE environment variable) to learn what partitioning Auto actually gives you.

Posted: Mon Mar 10, 2008 8:04 pm
by Nripendra Chand
Just check whether the 'APT_NO_PART_INSERTION' and 'APT_NO_SORT_INSERTION' environment variables are set or not. These two variables control whether automatic partitioning and automatic sorting occur.

Thanks,
Nripendra

Posted: Mon Mar 10, 2008 9:08 pm
by ray.wurlod
A clarification, in view of the thread subject. The second of these environment variables governs whether tsort operators will be inserted, and has nothing at all to do with partitioning.
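
To show that sorting is a separate concern from partitioning, here is a minimal Python sketch of a generic single-pass merge join running inside one partition. It is not the Join stage's actual implementation; `merge_left_join` and the sample data are made up, and unique keys on the right input are assumed.

```python
def merge_left_join(left, right, key):
    """Single-pass left merge join; assumes both inputs are sorted by key
    and that keys on the right input are unique."""
    out = []
    j = 0
    for l in left:
        # Advance the right cursor past keys smaller than the current left key.
        while j < len(right) and right[j][key] < l[key]:
            j += 1
        if j < len(right) and right[j][key] == l[key]:
            out.append((l[key], right[j]["desc"]))
        else:
            out.append((l[key], None))
    return out


# Hypothetical data: all rows are already in the same partition.
left = [{"id": 1}, {"id": 2}, {"id": 3}]
right_unsorted = [{"id": 3, "desc": "d3"}, {"id": 1, "desc": "d1"}, {"id": 2, "desc": "d2"}]
right_sorted = sorted(right_unsorted, key=lambda r: r["id"])

sorted_result = merge_left_join(left, right_sorted, "id")
unsorted_result = merge_left_join(left, right_unsorted, "id")

print(sorted_result)
print(unsorted_result)
```

Even with every matching row in the same partition, the unsorted input misses matches, so the sort requirement has to be met independently of the partitioning.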