As per my understanding of hash partitioning, records with the same key values are sent to the same processing node.
How does hash partitioning work with and without the key values sorted? Why is sorting mandatory when the partitioning method is Hash? What happens if the records are hash partitioned but not sorted in stages like Join, Remove Duplicates, Change Capture, etc.?
Hash partitioning and Sorting
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 73
- Joined: Wed Sep 30, 2009 5:20 am
Regards,
Kannan
-
- Premium Member
- Posts: 730
- Joined: Tue Nov 04, 2008 10:14 am
- Location: Bangalore
Re: Hash partitioning and Sorting
Jayakannan wrote: As per my understanding of Hash partitioning same key values will be partitioned into same processing node.
Correct.
Jayakannan wrote: How Hash partitioning works with/without key values Sorted?
The hash operator does not require sorted data, so the result is the same either way; sorting first only adds the extra burden of the sort.
Jayakannan wrote: Why Sorting is mandatory when the partitioning method is Hash?
Wrong, it is not required.
Jayakannan wrote: What happens if the records are Hash partitioned but not Sorted in stages like Join, Remove Duplicate, Change Capture etc.?
You end up with incorrect data. These stages mandate sorted input, and if there is no explicit sort, tsort operators are inserted wherever required (there are reported cases where this did not happen and the data was not as expected).
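To illustrate the last point, here is a rough Python sketch (not DataStage code). It assumes a toy partitioner that uses `key % num_nodes` as a stand-in for the real hash function, and a hypothetical `remove_adjacent_duplicates` helper that, like a sort-dependent stage, only compares each record with the one before it. The partitioning puts equal keys on the same node whether or not the input is sorted, but the duplicate removal only gives the right answer once each partition is sorted.

```python
from collections import defaultdict

def hash_partition(keys, num_nodes):
    """Toy key-based partitioner: key % num_nodes stands in for a real hash."""
    parts = defaultdict(list)
    for key in keys:
        parts[key % num_nodes].append(key)
    return parts

def remove_adjacent_duplicates(keys):
    """Keep a key only if it differs from the previous one (needs sorted input)."""
    out = []
    for key in keys:
        if not out or out[-1] != key:
            out.append(key)
    return out

keys = [1, 3, 1, 2, 4, 2, 3, 1]          # 4 distinct keys, unsorted
parts = hash_partition(keys, num_nodes=2)

# Same key always lands on the same node, sorted or not.
nodes_per_key = defaultdict(set)
for node, part in parts.items():
    for key in part:
        nodes_per_key[key].add(node)

# Without a sort, equal keys are not adjacent, so duplicates survive.
unsorted_total = sum(len(remove_adjacent_duplicates(p)) for p in parts.values())
# With a sort inside each partition, every key comes out exactly once.
sorted_total = sum(len(remove_adjacent_duplicates(sorted(p))) for p in parts.values())

print(unsorted_total, sorted_total)  # 8 4
```

With this input the unsorted run keeps all 8 records, while the sorted run returns the 4 distinct keys.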
- Zulfi
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Some stages require sorted input because of the way they operate. This is unrelated to the partitioning algorithm used.
If you do not achieve key adjacency using a key-based partitioning algorithm your results can be simply wrong; for example on four nodes summarising by US state, you can end up with as many as 200 groups (4 x 50).
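The 4 x 50 figure can be reproduced with a small Python sketch. Everything here is assumed for illustration: placeholder state codes `S00` to `S49`, four records per state, round robin as the non-key-based method, and `zlib.crc32` standing in for the engine's hash function. Each node summarises only the rows it receives, and the groups from all nodes are then counted.

```python
import zlib
from collections import Counter

NUM_NODES = 4
states = [f"S{i:02d}" for i in range(50)]        # 50 placeholder state codes
records = [s for s in states for _ in range(4)]  # 4 records per state

def round_robin(index, state):
    """Not key-based: the node depends only on the record's position."""
    return index % NUM_NODES

def hashed(index, state):
    """Key-based: the node depends only on the key value."""
    return zlib.crc32(state.encode()) % NUM_NODES

def count_groups(records, node_of):
    """Summarise by state on each node, then count the groups produced overall."""
    per_node = [Counter() for _ in range(NUM_NODES)]
    for index, state in enumerate(records):
        per_node[node_of(index, state)][state] += 1
    return sum(len(counter) for counter in per_node)

round_robin_groups = count_groups(records, round_robin)
hash_groups = count_groups(records, hashed)
print(round_robin_groups, hash_groups)  # 200 50
```

Round robin spreads each state across all four nodes, so every node emits its own partial group per state; the key-based method keeps each state on one node and yields the expected 50 groups.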
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.