Hash partition

subrat · Post by **subrat** » Fri Jan 25, 2008 2:34 am

Hi

Can anyone help me understand how hash partitioning happens in a parallel job? How is the key generated?

Subrat

ray.wurlod · Post by **ray.wurlod** » Sat Jan 26, 2008 12:37 am

You identify the key column(s). The characters from those columns are used as input to a hashing function, which always returns the same uint32 value (the "hashvalue") for any given set of characters.

This "hashvalue" is divided by the number of partitions and the remainder is the partition number to which that row is allocated.
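
As a rough illustration of those two steps, here is a minimal Python sketch. It is not the engine's actual hashing function: `zlib.crc32` is only a stand-in that happens to return an unsigned 32-bit value, and the names `hash_value` and `partition_for` are hypothetical.

```python
import zlib


def hash_value(key_columns):
    """Map the characters of the key column(s) to a uint32.

    Assumption: zlib.crc32 is used purely for illustration; the real
    partitioner's hashing function is not shown in this thread.
    """
    # Join the key columns with a separator so ("ab", "c") != ("a", "bc").
    data = "\x1f".join(str(c) for c in key_columns).encode("utf-8")
    # The mask keeps the result in the uint32 range.
    return zlib.crc32(data) & 0xFFFFFFFF


def partition_for(key_columns, partition_count):
    """The remainder of hashvalue / partition_count is the partition number."""
    return hash_value(key_columns) % partition_count


if __name__ == "__main__":
    for key in [("1001",), ("1002",), ("1001",)]:
        print(key, "->", partition_for(key, 4))
```

The first and third keys are identical, so they are reported with the same partition number; which number that is depends entirely on the hashing function chosen.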

subrat · Post by **subrat** » Sat Jan 26, 2008 3:41 am

Could u please briefly explain what is meant by 'which always returns the same uint32 value'?
Moreover, can a processing node contain more than one partition?

ray.wurlod · Post by **ray.wurlod** » Sat Jan 26, 2008 4:31 am

I don't believe that U's technical expertise is up to providing the explanation you seek so I will attempt your edification.

Code:

hashvalue = f(keyvalue)

hashvalue is the uint32 result
f() is the partitioning algorithm

Code:

partition_number_for_row = Mod(hashvalue, partition_count)

There is a maximum of one partition per processing node. Use of node pools may mean that there are fewer partitions than processing nodes.
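
To make "always returns the same value" concrete, the hypothetical Python sketch below assigns a handful of rows to partitions so you can see that rows sharing a key value end up together. `zlib.crc32` is only a stand-in for the real hashing function, and `assign_partitions` and the sample column names are illustrative, not part of the product.

```python
import zlib
from collections import defaultdict


def assign_partitions(rows, key, partition_count):
    """Group rows by Mod(hashvalue, partition_count) of their key column.

    Assumption: zlib.crc32 is a stand-in for the real hashing function.
    """
    partitions = defaultdict(list)
    for row in rows:
        hashvalue = zlib.crc32(str(row[key]).encode("utf-8")) & 0xFFFFFFFF
        partitions[hashvalue % partition_count].append(row)
    return dict(partitions)


rows = [
    {"cust_id": "A17", "amount": 10},
    {"cust_id": "B42", "amount": 25},
    {"cust_id": "A17", "amount": 5},
]
result = assign_partitions(rows, "cust_id", partition_count=4)
for number, members in sorted(result.items()):
    print(number, members)
```

Both A17 rows land in the same partition on every run, because the same characters always produce the same hashvalue. Different keys may or may not share a partition.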

subrat · Post by **subrat** » Sun Jan 27, 2008 10:32 pm

Thanks Ray for this valuable info...

Can you please also advise: if we are doing hash partitioning, is it always better to use all of the table's key columns as the partitioning keys as well? If so, can I do the same for the other partitioning types?

Moreover, in the case of join, lookup, etc., is the data matched only within a partition, or across partitions as well?

Thanks
Subrat
