Despite reading through a large number of posts as well as the DS documentation, I'm still not entirely clear on some aspects of partitioning for lookup stages.
I'd like to perform a lookup against a text file and have chosen to use a lookup stage. By default, the "Auto" option is selected as the partition type for both the primary and reference streams. My questions are as follows:
1. With "Auto" selected for both the primary and reference streams, can I rely on DataStage to manage partitioning such that records will successfully be retrieved based on the Key Expression defined in the "main" lookup screen? (I want to make sure that DS will reference the correct partition for the lookup record.)
2. If the partition type is set to "Hash" in the primary stream, will setting the partition type and keys the same way in the reference stream also ensure that the lookup is successful? How is this better or worse than using the "Auto" setting?
3. If the partition type is set to "Entire" for the reference stream, does the partition type of the primary stream matter?
4. Apart from the Partition Key, there are two other types of keys that appear in the Lookup stage: the "Key" column defined in the metadata for a column and the Key Expression in the lookup. Do either of these have any impact on partitioning?
I realize that this is a long post, but I would appreciate any help you can provide.
Peter
Partitioning in Lookup Stage
Moderators: chulett, rschirm, roy
1. Most probably yes (as per documentation), but If I were you, I wont take risk to leave Datastage to decide upon the partition. I ll do it manually.
2. Yes if both the streams are partitioned on HASH on the lookup key, and if you are sure that both will be partitioned exactly the same, it will work. There may be some cases where, Char type in one stream and Varchar type in other stream, which will lead to mismatch.
3. Thats why, "Entire" is preferred. It can match any partition type from Input stream. Be aware that, "Entire" in the input stream will create dulplicates.
4. Partitioning on the Lookup key will make the right choice.
2. Yes if both the streams are partitioned on HASH on the lookup key, and if you are sure that both will be partitioned exactly the same, it will work. There may be some cases where, Char type in one stream and Varchar type in other stream, which will lead to mismatch.
3. Thats why, "Entire" is preferred. It can match any partition type from Input stream. Be aware that, "Entire" in the input stream will create dulplicates.
4. Partitioning on the Lookup key will make the right choice.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
For a Lookup stage, "Auto" selects:
- Round Robin for the stream input link
Entire for the reference input link(s)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks to both of you for your quick (and helpful) responses.
Ray - I do have one follow-up question...
Your response suggests that "Auto" can mean different things in different stages. This certainly makes sense considering the various functions performed. In which DataStage document can I find out the behaviour of the "Auto" setting for each stage type? (I was looking through the documentation earlier today, and don't remember seeing this behaviour explicitly described for the lookup stage, though I was probably looking in the wrong place.)
Ray - I do have one follow-up question...
Your response suggests that "Auto" can mean different things in different stages. This certainly makes sense considering the various functions performed. In which DataStage document can I find out the behaviour of the "Auto" setting for each stage type? (I was looking through the documentation earlier today, and don't remember seeing this behaviour explicitly described for the lookup stage, though I was probably looking in the wrong place.)
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
It's taught in the IBM Advanced DataStage class. Therefore it's documented in the training manual. I'm not aware that it's documented in any of the PDF manuals that ship with the product.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.