Hi,
We use PX in DS 6.0.1, with datasets as the reference input to the Lookup stage. The Lookup stage is set to the Auto partitioning method and we seem to get the right data, but at PX training we were told that the 'Entire' partitioning method must be used in order to retrieve lookup data from lookup datasets.
Could anyone confirm whether any partitioning method can be used, or whether only Entire will work?
Thanks in advance.
Regards
Saravanan
Note: We run on 4 nodes on a single server.
Partition while using lookup
Data will still be partitioned. However, you have no way of knowing in advance in which partition your particular key will occur, so you have to use the Entire partitioning method so that the lookup can "see" the entire data set. PX will automatically look after ensuring that the retrieved row ends up in the correct partition for processing.
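To illustrate the point above, here is a minimal Python sketch. It is a toy model, not DataStage code: the function names, the 8-row reference set, and the round-robin split are all assumptions made for illustration. Each "node" can only look up keys in its own slice of the reference data, so an arbitrary split misses rows, while Entire gives every node a full copy.

```python
NODES = 4  # same node count as in the question

def round_robin(rows, nodes=NODES):
    """Deal rows out to nodes in turn (a stand-in for an arbitrary split)."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def entire(rows, nodes=NODES):
    """Give every node a complete copy of the rows."""
    return [list(rows) for _ in range(nodes)]

def lookup(input_parts, ref_parts):
    """Each node looks up its input keys in its own reference slice only."""
    hits, misses = [], []
    for node, keys in enumerate(input_parts):
        ref = dict(ref_parts[node])
        for key in keys:
            if key in ref:
                hits.append((key, ref[key]))
            else:
                misses.append(key)
    return hits, misses

# Hypothetical data: 8 reference rows and 8 input keys in arbitrary order.
reference = [(k, f"name_{k}") for k in range(8)]
input_keys = [3, 0, 5, 6, 1, 2, 7, 4]
input_parts = round_robin(input_keys)

hits_split, misses_split = lookup(input_parts, round_robin(reference))
hits_entire, misses_entire = lookup(input_parts, entire(reference))

print("split reference :", len(hits_split), "hits,", len(misses_split), "misses")
print("entire reference:", len(hits_entire), "hits,", len(misses_entire), "misses")
```

With the split reference, a key is found only if it happens to sit on the same node as the input row that needs it; with Entire, the placement of the input rows no longer matters.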
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Re: Partition while using lookup
That is one option. Unfortunately, it does not invoke the parallel lookup method. That is fixed for 7.0 (or 7.0.1, hazy memory right now).
The recommendation is to use hash partitioning on both the input and reference links in order to take advantage of parallel lookups.
The fix defaults the stage to hash for the provided key fields when you select auto.
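As a rough illustration of why hashing both links works, here is a small Python sketch. It is a toy model rather than DataStage code: the helper name `hash_partition`, the sample data, and the use of `key % NODES` as the hash function are assumptions for the example. Because both links are partitioned by the same function of the key, an input row and its matching reference row always land on the same node, and each node holds only its share of the reference data.

```python
NODES = 4  # same node count as in the question

def hash_partition(rows, key_of, nodes=NODES):
    """Send each row to the node chosen by its key (key % nodes as a toy hash)."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[key_of(row) % nodes].append(row)
    return parts

# Hypothetical data: 8 reference rows and 8 input keys in arbitrary order.
reference = [(k, f"name_{k}") for k in range(8)]
input_keys = [3, 0, 5, 6, 1, 2, 7, 4]

# Partition BOTH links on the lookup key with the same function.
ref_parts = hash_partition(reference, key_of=lambda row: row[0])
in_parts = hash_partition(input_keys, key_of=lambda key: key)

hits, misses = [], []
for node in range(NODES):
    ref = dict(ref_parts[node])  # this node's slice of the reference data
    for key in in_parts[node]:
        if key in ref:
            hits.append((key, ref[key]))
        else:
            misses.append(key)

# Reference rows held across all nodes: hash keeps one copy in total,
# whereas Entire would keep one full copy per node.
rows_held_hash = sum(len(p) for p in ref_parts)
rows_held_entire = NODES * len(reference)

print(len(hits), "hits,", len(misses), "misses")
print("reference rows held:", rows_held_hash, "(hash) vs", rows_held_entire, "(entire)")
```

The trade-off in this model is that hashing requires repartitioning both links on the key, while Entire needs no repartitioning of the input but replicates the reference data on every node.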
-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
It is not recommended to sort. There is a known bug with the Lookup stage for 6.x that would crash the job if you attempt to sort the data with several conditions.
However, it is still not recommended to sort because that takes away the advantage of the lookup stage -- rapid lookup. You might as well use the join stage if you sort.
-T.J.
Developer of DataStage Parallel Engine (Orchestrate).