Partition while using lookup

Posted: Fri Feb 13, 2004 11:46 pm
by elavenil
Hi,

We use PX in DS 6.0.1. We are using datasets as lookups in the Lookup stage. The Auto partition method is used in the Lookup stage and we seem to get the right data, but when we attended the PX training we were told that the 'Entire' partition method must be used in order to get lookup data from the lookup datasets.

Could anyone confirm whether any partition method can be used, or whether only the Entire partition method must be used?

Thanks in advance.

Regards
Saravanan

Note: 4 nodes from Single server are used.

Posted: Sat Feb 14, 2004 4:34 pm
by ray.wurlod
Data will still be partitioned. However, you have no way of knowing in advance in which partition your particular key will occur, so you have to use the Entire partitioning method so that the lookup can "see" the entire data set. PX will automatically look after ensuring that the retrieved row ends up in the correct partition for processing.
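
To illustrate the point about not knowing which partition a key lands in, here is a minimal toy model in Python. It is not DataStage code: the function names, the sample keys, and the use of round-robin as a stand-in for "any partitioning that ignores the key" are all assumptions made for the sketch. Each node can only look up against the reference rows it holds.

```python
NODES = 4  # the original post mentions 4 nodes on a single server

def round_robin(rows, nodes=NODES):
    """Deal rows out to nodes in turn, ignoring the key (hypothetical helper)."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def entire(rows, nodes=NODES):
    """Give every node a full copy of the rows (models 'Entire')."""
    return [list(rows) for _ in range(nodes)]

def lookup_per_partition(input_parts, ref_parts):
    """Each node looks up its input keys only in its own reference partition."""
    matched, missed = [], []
    for in_part, ref_part in zip(input_parts, ref_parts):
        table = dict(ref_part)  # in-memory lookup table for this node
        for key in in_part:
            if key in table:
                matched.append((key, table[key]))
            else:
                missed.append(key)
    return matched, missed

# Made-up sample data: every input key exists somewhere in the reference.
reference = [(k, f"desc-{k}") for k in range(8)]
input_keys = [3, 0, 5, 6, 1, 7, 2, 4]

in_parts = round_robin(input_keys)

# Reference split without regard to the key: many keys sit on the wrong node.
_, missed_rr = lookup_per_partition(in_parts, round_robin(reference))

# Reference copied to every node: every key is found, whichever node asks.
matched_entire, missed_entire = lookup_per_partition(in_parts, entire(reference))

print("missed with key-agnostic reference partitioning:", missed_rr)
print("missed with Entire:", missed_entire)
```

The cost of Entire in this model is that every node holds all of the reference rows, which is the trade-off the later replies about hash partitioning address.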

Re: Partition while using lookup

Posted: Sun Feb 15, 2004 11:07 pm
by Teej
elavenil wrote: We use PX in DS 6.0.1. We are using datasets as look ups in the lookup stage. Auto Partition method used in the lookup stage and we seem to get the right data but when we attended the PX training, we were told that 'Entire' partition method must be used in order to get lookup data from the lookup datasets.

That is one option. Unfortunately, it does not invoke the parallel lookup method. That is fixed in 7.0 (or 7.0.1, hazy memory right now).

It is recommended to use hash partitioning on the key fields for both the input and reference links in order to take advantage of parallel lookups.

The fix defaults the stage to hash for the provided key fields when you select auto.
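
As a rough illustration of why hashing both links on the same key works, here is a small Python toy model (not DataStage code; the helper names and sample data are assumptions for the sketch). Because the same hash function is applied to the key on both sides, an input row and its matching reference row always land on the same node, and each node holds only its share of the reference data.

```python
NODES = 4  # the original post mentions 4 nodes on a single server

def hash_partition(rows, key_of, nodes=NODES):
    """Send each row to the node chosen by hashing its key (hypothetical helper)."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[hash(key_of(row)) % nodes].append(row)
    return parts

def lookup_per_partition(input_parts, ref_parts):
    """Each node looks up its input keys only in its own reference partition."""
    matched, missed = [], []
    for in_part, ref_part in zip(input_parts, ref_parts):
        table = dict(ref_part)  # in-memory lookup table for this node
        for key in in_part:
            if key in table:
                matched.append((key, table[key]))
            else:
                missed.append(key)
    return matched, missed

# Made-up sample data; integer keys keep the hash values easy to follow.
reference = [(k, f"desc-{k}") for k in range(8)]
input_keys = [3, 0, 5, 6, 1, 7, 2, 4, 3, 3]

# Both links are partitioned on the same key with the same function.
in_parts = hash_partition(input_keys, key_of=lambda k: k)
ref_parts = hash_partition(reference, key_of=lambda r: r[0])

matched, missed = lookup_per_partition(in_parts, ref_parts)
ref_rows_per_node = [len(p) for p in ref_parts]

print("missed:", missed)
print("reference rows held per node:", ref_rows_per_node)
```

In this model no lookups are missed, yet the reference rows are divided among the nodes instead of being copied to each one.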

-T.J.

Posted: Mon Feb 16, 2004 12:46 am
by praj
As Teej said, it is recommended to have hash partitioning on the keys.
I also think it is better to have the lookup fields sorted (although it is not necessary for DS). You can use the "Perform sort" checkbox on the partitioning sheet for this :)

Posted: Mon Feb 16, 2004 8:46 am
by Teej
It is not recommended to sort. There is a known bug with the Lookup stage for 6.x that would crash the job if you attempt to sort the data with several conditions.

However, it is still not recommended to sort because that takes away the advantage of the lookup stage -- rapid lookup. You might as well use the join stage if you sort.
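
The difference can be sketched in plain Python (a toy comparison, not DataStage internals; the function names and data are made up for illustration). A lookup builds an in-memory table from the reference rows and probes it in whatever order the input arrives, while a merge-style join has to sort both sides first. The sketch assumes unique reference keys.

```python
def hash_lookup(input_keys, reference):
    """Lookup style: build a table once, then probe it; no sorting needed."""
    table = dict(reference)
    return [(k, table.get(k)) for k in input_keys]

def merge_join(input_keys, reference):
    """Join style: sort both sides, then walk them together (unique ref keys)."""
    left = sorted(input_keys)
    right = sorted(reference)
    out, j = [], 0
    for k in left:
        # Advance the reference cursor until it reaches or passes k.
        while j < len(right) and right[j][0] < k:
            j += 1
        if j < len(right) and right[j][0] == k:
            out.append((k, right[j][1]))
        else:
            out.append((k, None))  # no match, same convention as hash_lookup
    return out

# Made-up sample data: key 9 has no reference row.
reference = [(3, "c"), (0, "a"), (5, "f")]
input_keys = [3, 0, 9, 3]

by_lookup = hash_lookup(input_keys, reference)
by_join = merge_join(input_keys, reference)

# Same rows either way; only the join paid for two sorts to get them.
print(sorted(by_lookup) == by_join)
```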

-T.J.

Posted: Mon Feb 16, 2004 9:37 pm
by elavenil
Thanks for your suggestions and detailed explanations.

I will use hash partition for input and reference links.

Regards
Saravanan