modulus partition


just4u_sharath
Premium Member
Posts: 236
Joined: Sun Apr 01, 2007 7:41 am
Location: Michigan

modulus partition

Post by just4u_sharath »

My Join stage has two inputs. Right now I am hash partitioning both inputs to the Join stage on an integer column (I want rows with the same key value to stay on the same node). Can I do the same using Modulus partitioning? Will Modulus also ensure that related records stay on the same node? What is the difference between the two partitioning methods, and which is faster?
Replies are always appreciated.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Modulus is more efficient than Hash, because the step of calculating the "hash value" is not required - the integer itself is divided by the number of partitions and the remainder is used as the partition number for that row. This guarantees that all rows with the same key value land on the same partition, so it is safe for a Join provided both inputs are Modulus partitioned on the join key. Note, however, that Modulus is only relevant for an integer key.
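
To make the arithmetic concrete, here is a minimal Python sketch of the idea, not DataStage code. The function names, the sample rows and the four-partition setup are all made up for illustration, and the CRC32 call is only a stand-in for whatever hash function the engine really uses.

```python
import zlib


def modulus_partition(key: int, num_partitions: int) -> int:
    """Partition number is simply the remainder of key / num_partitions."""
    return key % num_partitions


def hash_partition(key: int, num_partitions: int) -> int:
    """Hash the key first, then take the remainder.

    CRC32 is an illustrative stand-in, NOT the hash DataStage uses.
    """
    return zlib.crc32(str(key).encode("ascii")) % num_partitions


def partition_rows(rows, key_index, num_partitions, partitioner):
    """Distribute rows into num_partitions lists using the given partitioner."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partitioner(row[key_index], num_partitions)].append(row)
    return partitions


# Hypothetical inputs to a join; the first field is the integer join key.
left = [(7, "a"), (12, "b"), (7, "c"), (3, "d")]
right = [(7, "x"), (3, "y"), (12, "z")]

left_parts = partition_rows(left, 0, 4, modulus_partition)
right_parts = partition_rows(right, 0, 4, modulus_partition)

for node, (l_rows, r_rows) in enumerate(zip(left_parts, right_parts)):
    print(node, l_rows, r_rows)
```

With four partitions, key 12 goes to partition 0 (12 mod 4 = 0) and keys 7 and 3 both go to partition 3 on both inputs, so matching rows meet on the same node. The hash version gives the same co-location guarantee; it just does extra work per row to compute the hash before taking the remainder.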
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.