ABOUT HASH PARTITIONING
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 12
- Joined: Sun Sep 23, 2007 12:21 am
- Location: chennai
-
- Participant
- Posts: 467
- Joined: Tue Mar 20, 2007 6:36 am
- Location: Chennai
- Contact:
Go through the Parallel Job Developer's Guide.
Minhajuddin
Hash is a key-based partitioning algorithm. It can be used with a key of any data type. The bytes (or characters) making up the key are processed through a function that yields a positive integer called a hash value. This number is divided by the number of partitions, and the remainder is the partition (node) number where that key value belongs. So for every distinct key value, all instances end up in the same partition.
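To make the hash-then-remainder idea concrete, here is a minimal Python sketch. It is not DataStage's actual algorithm (that code is not public); it assumes `zlib.crc32` as a stand-in hash function, and `hash_partition` and the sample rows are hypothetical names made up for illustration.

```python
import zlib

def hash_partition(key, num_partitions):
    """Return a partition number: hash of the key's bytes, modulo partition count.

    zlib.crc32 is only a stand-in here; the real engine uses its own function.
    """
    key_bytes = str(key).encode("utf-8")   # any data type, reduced to bytes
    hash_value = zlib.crc32(key_bytes)     # non-negative integer in Python 3
    return hash_value % num_partitions     # remainder = partition number

# Hypothetical input rows: (customer key, amount)
rows = [("C001", 10), ("C002", 20), ("C001", 30), ("C003", 40), ("C002", 50)]

partitions = {}
for key, amount in rows:
    partitions.setdefault(hash_partition(key, 4), []).append((key, amount))

for number in sorted(partitions):
    print(number, partitions[number])
```

Whatever partition numbers come out, both "C001" rows land in the same partition, and likewise both "C002" rows. That is the property that matters for key-based stages such as joins and aggregations.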
Hope this helps!!
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hashing is a widely-used technique for reliably (repeatably) selecting one from a finite number of alternatives based on a given value.
In parallel jobs, hash partitioning chooses one from the finite number of processing nodes based on the combination of values provided as "key" on the partitioning tab.
Unless you have the actual partitioning algorithm code (which you don't), you cannot predict which node will be chosen for any particular key value, except in extremely simple cases. However, the creators of DataStage are particularly proficient at writing good hashing algorithms that yield a reasonably even spread.
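As a rough illustration of "repeatable" and "reasonably even spread", the sketch below hashes a combination of two key columns and counts how many rows each node receives. It is only an approximation of the idea: SHA-256 stands in for the unknown DataStage hash, and `node_for`, the separator byte, and the generated keys are all assumptions made for this example.

```python
import hashlib
from collections import Counter

def node_for(key_values, num_nodes):
    """Pick a node from the combination of key column values.

    SHA-256 is an assumed stand-in; the real partitioner's function is not public.
    """
    # Join the key columns with a separator so ("ab", "c") differs from ("a", "bc")
    combined = "\x1f".join(str(v) for v in key_values).encode("utf-8")
    digest = hashlib.sha256(combined).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

NUM_NODES = 4

# 10,000 hypothetical rows keyed on (customer id, region code)
counts = Counter(
    node_for((f"CUST{i:05d}", i % 7), NUM_NODES) for i in range(10000)
)

for node in range(NUM_NODES):
    print(node, counts[node])
```

Each node should receive roughly a quarter of the rows, and rerunning the script gives the same assignment every time. The node chosen for any single key is still not something you could guess without running the function.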
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.