Hi
Can anyone help me understand how hash partitioning happens in a parallel job? How is the key generated?
Subrat
Hash partition
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
You identify the key column(s). The characters from those columns are used as input to a hashing function, which always returns the same uint32 value (the "hashvalue") for any given set of characters.
This "hashvalue" is divided by the number of partitions and the remainder is the partition number to which that row is allocated.
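A minimal sketch of those two steps in Python, for illustration only. The hash function the parallel engine actually uses is not given in this thread, so `zlib.crc32` is a stand-in chosen because it also returns an unsigned 32-bit value; `hash_value` and `partition_number` are made-up names.

```python
import zlib


def hash_value(key_columns):
    """Stand-in hash: maps the characters of the key column(s) to a uint32.

    Assumption: zlib.crc32 is used here only for illustration; it is not
    the engine's own hashing function.
    """
    # Join the key columns with a separator so ("ab", "c") and ("a", "bc")
    # do not produce the same input string.
    data = "\x1f".join(str(col) for col in key_columns).encode("utf-8")
    return zlib.crc32(data) & 0xFFFFFFFF


def partition_number(key_columns, partition_count):
    """Remainder of hashvalue / partition_count selects the partition."""
    return hash_value(key_columns) % partition_count


# The same key characters always give the same hashvalue, so the row
# always goes to the same partition for a given partition count.
first = partition_number(("CUST001", "2024"), 4)
second = partition_number(("CUST001", "2024"), 4)
```

Whatever hash function is used, the result of the remainder step is always between 0 and `partition_count - 1`.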
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Could u please briefly explain what is meant by 'which always returns the same uint32 value'?
Moreover, can a processing node contain more than one partition?
ray.wurlod wrote:You identify the key column(s). The characters from those columns are used as input to a hashing function, which always returns the same uint32 value (the "hashvalue") for any given set of characters ...
I don't believe that U's technical expertise is up to providing the explanation you seek so I will attempt your edification.
Code:
hashvalue = f(keyvalue)
hashvalue is the uint32 result
f() is the partitioning algorithm
Code:
partition_number_for_row = Mod(hashvalue, partition_count)
There is a maximum of one partition per processing node. Use of node pools may mean that there are fewer partitions than processing nodes.
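To make the two formulas concrete, here is a rough Python sketch that allocates a few rows to partitions. It is an illustration under assumptions, not the engine's code: `f()` is played by `zlib.crc32` (any deterministic function returning a uint32 would do), and the `allocate` helper, the `cust` column and the sample rows are all invented for the example.

```python
import zlib
from collections import defaultdict


def f(keyvalue):
    """Stand-in for the partitioning algorithm; returns a uint32."""
    # Assumption: crc32 is only a placeholder for the real hash function.
    return zlib.crc32(keyvalue.encode("utf-8")) & 0xFFFFFFFF


def allocate(rows, key, partition_count):
    """Group rows by Mod(hashvalue, partition_count)."""
    partitions = defaultdict(list)
    for row in rows:
        hashvalue = f(str(row[key]))
        partition_number_for_row = hashvalue % partition_count
        partitions[partition_number_for_row].append(row)
    return dict(partitions)


# Hypothetical sample data: two rows share the key value "C1".
rows = [
    {"cust": "C1", "amt": 10},
    {"cust": "C2", "amt": 5},
    {"cust": "C1", "amt": 7},
]
result = allocate(rows, "cust", 4)

# Partitions that received at least one "C1" row.
c1_partitions = [p for p, rs in result.items()
                 if any(r["cust"] == "C1" for r in rs)]
```

Because `f()` returns the same value for the same key characters, both `C1` rows land in a single partition, whichever partition number that turns out to be.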
Thanks Ray for this valuable info...
Could you also advise: when using hash partitioning, is it always better to use all of the table's key columns as the partitioning keys? If so, can I do the same for the other partitioning types as well?
Moreover, for Join, Lookup, etc., is data matched only within a partition, or across partitions as well?
Thanks
Subrat