modulus partition

Posted: Sat Jan 26, 2008 1:40 am
by just4u_sharath
The Join stage has two inputs. Right now I am hash partitioning both inputs to the Join stage on an integer column, because I want rows with the same key value to stay on the same node. Can I do the same using Modulus partitioning? Will Modulus also ensure that related records stay on the same node? What is the difference between the two partitioning methods, and which is faster?
Replies are always appreciated.

Posted: Sat Jan 26, 2008 4:28 am
by ray.wurlod
Modulus is more efficient than Hash, because the step of calculating the "hash value" is not required: the integer itself is divided by the number of partitions and the remainder is used as the partition number for that row. This will guarantee key adjacency, that is, rows with the same key value always land in the same partition on both inputs. Note, however, that Modulus is only relevant for an integer key.
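
To make the difference concrete, here is a rough Python sketch of the two partitioning rules. It is only an illustration, not DataStage code: the function names are made up for this example, and zlib.crc32 is an assumed stand-in for whatever hash function the engine actually uses internally.

```python
import zlib


def modulus_partition(key: int, num_partitions: int) -> int:
    """Partition number is simply the remainder of key / num_partitions."""
    return key % num_partitions


def hash_partition(key: int, num_partitions: int) -> int:
    """Partition number needs an extra step: hash the key first.

    crc32 is an assumption for illustration only; the real engine's
    hash function is not specified in this thread.
    """
    hashed = zlib.crc32(str(key).encode("utf-8"))
    return hashed % num_partitions


def partition_rows(rows, key_index, num_partitions, partitioner):
    """Distribute rows into num_partitions lists using the given rule."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        target = partitioner(row[key_index], num_partitions)
        partitions[target].append(row)
    return partitions


# Two hypothetical join inputs keyed on an integer in column 0.
left = [(10, "a"), (11, "b"), (14, "c"), (10, "d")]
right = [(10, "x"), (14, "y"), (11, "z")]

left_parts = partition_rows(left, 0, 4, modulus_partition)
right_parts = partition_rows(right, 0, 4, modulus_partition)

# Key 10 gives 10 % 4 = 2 on both inputs, so matching rows meet in partition 2.
print(left_parts[2])   # [(10, 'a'), (14, 'c'), (10, 'd')]
print(right_parts[2])  # [(10, 'x'), (14, 'y')]
```

Either rule sends equal keys to the same partition, so both are safe for a join as long as both inputs use the same method on the same key. Modulus just skips the hashing step.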