This doubt is related to partitioning basics. Please let me know the answer or a pointer to the answer.
Here is the situation:
I have 2 extract jobs. Each job populates 1 dataset. Extract 1 fetches 17 million rows. Extract 2 fetches 7 million rows. Both datasets have a key column, 'KEY'.
I have used hash partitioning on KEY in both extract jobs. Can I expect a particular key value to go to the same partition in both datasets? If yes, how? (I might run the 2 extract jobs on 2 different days.)
(I think the answer is 'No' and I will have to repartition the data in subsequent transform jobs...)
Thanks
How does Partitioning work for more than 1 job
The same key using the same hashing algorithm with the same APT_CONFIG file will go to the same partition number on both files.
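To make that concrete, here is a minimal sketch of the idea in Python. It is not DataStage's actual hash partitioner: CRC32 is only a stand-in for "some deterministic hash", and the names `partition_of`, `partition_dataset` and the choice of 4 partitions are assumptions made for illustration. The point is only that the partition number depends on the key value, the hash function and the partition count, and not on which job ran or when.

```python
import zlib

def partition_of(key, num_partitions):
    """Map a key to a partition number with a deterministic hash.

    CRC32 is a stand-in here, not the algorithm DataStage uses.
    (Python's built-in hash() on strings is salted per process, so it
    would not be stable across two separate runs.)
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions

def partition_dataset(keys, num_partitions):
    """Simulate one extract job: return {partition_number: set_of_keys}."""
    partitions = {p: set() for p in range(num_partitions)}
    for key in keys:
        partitions[partition_of(key, num_partitions)].add(key)
    return partitions

def find_partition(partitions, key):
    """Look up which partition a key ended up in."""
    return next(p for p, keys in partitions.items() if key in keys)

NUM_PARTITIONS = 4  # hypothetical: same node count for both jobs

# Two independent "extracts" with overlapping key ranges (scaled-down row counts).
extract_1 = partition_dataset(range(0, 1700), NUM_PARTITIONS)
extract_2 = partition_dataset(range(1000, 1700), NUM_PARTITIONS)

common_keys = range(1000, 1700)
mismatches = [
    k for k in common_keys
    if find_partition(extract_1, k) != find_partition(extract_2, k)
]
print(len(mismatches))
```

With the same hash and the same partition count, `mismatches` stays empty no matter how many rows each extract has or in which order the jobs run.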
Do you have 2 datasets or are you appending one dataset to the other?
If you have 2 datasets which are not being appended you will have all the same keys in the same partitions.
Minhajuddin
MVL wrote: Ok. I stated it as a general rule. But looking at your message, I think that's not the case. Can 2 jobs share the same RT_CONFIG file? Is this file in any way related to hash partitioning?
Thanks

The configuration file the rest of us have been talking about is the one whose pathname is specified by the APT_CONFIG_FILE environment variable, not the RT_CONFIGnnn hashed file in the Repository (which has nothing whatsoever to do with hash partitioning, except that the specification in your design is stored in RT_CONFIGnnn).
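As a rough illustration of why the file named by APT_CONFIG_FILE matters: it defines the processing nodes, and so the number of partitions. The sketch below assumes a simple "hash modulo node count" scheme with CRC32 as a stand-in hash (not DataStage's real algorithm), and the node counts 4 and 8 are hypothetical. It shows that the same key can land in a different partition number once the node count changes.

```python
import zlib

def partition_of(key, num_nodes):
    """Deterministic stand-in for hash partitioning: hash(key) mod node count."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

keys = range(1000)

# Hypothetical scenario: job 1 ran with a 4-node configuration,
# job 2 ran with an 8-node configuration.
moved = [k for k in keys if partition_of(k, 4) != partition_of(k, 8)]

print(f"{len(moved)} of {len(keys)} keys land in a different partition number")
```

So if the two extract jobs might run under different configuration files, repartitioning on KEY in the downstream job is the safe choice.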
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.