How to decide which partitioning method to use in which kind of job

taral · Post by **taral** » Tue May 11, 2010 4:21 am

We have different types of partitioning, e.g.
Hash
Entire
Round robin
Same

How can we decide which partitioning method to use?

srinivas.g · Post by **srinivas.g** » Tue May 11, 2010 4:44 am

By default it is Auto.

Join, Merge --> Hash
Lookup --> Entire

chulett · Post by **chulett** » Tue May 11, 2010 6:16 am

Lookup-->Entire as a blanket statement? Yikes. To the OP - practice, experience and experimentation help.

nagarjuna · Post by **nagarjuna** » Tue May 11, 2010 8:46 am

It depends on the type of requirement you have. First decide whether you need key-based or non-key partitioning; then, as Craig mentioned, experiment and decide.

ray.wurlod · Post by **ray.wurlod** » Tue May 11, 2010 5:03 pm

Welcome aboard.

There are actually eight choices for partitioning algorithm, and four for collecting. However, the decision is usually easier than that.

If you don't need to keep like-valued keys together, use an algorithm that spreads rows as evenly as possible over processing nodes. If you do need to keep like-valued keys together, use a key-based algorithm (modulus for a single integer key, hash otherwise). Range partitioning is rarely used, and requires that you pre-process your data to generate a "range map". Entire for the reference input to a Lookup stage is handy in that it guarantees that all valid lookups will succeed, but it comes at a cost in cluster/grid environments, in that all records have to be sent to all nodes (in an SMP environment one copy is lodged in shared memory).
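
To make the distinction concrete, here is a rough Python sketch of how the keyless and key-based choices distribute rows. It is only an illustration of the idea, not how the parallel engine is implemented: the function names are made up for this example, and `zlib.crc32` is just a stand-in for whatever hash function the engine really uses.

```python
import zlib

def round_robin(rows, nodes):
    """Keyless: deal rows out in turn so each node gets an even share."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def modulus(rows, nodes, key):
    """Key-based, single integer key: node = key value mod number of nodes."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[row[key] % nodes].append(row)
    return parts

def hash_partition(rows, nodes, key):
    """Key-based, any key type: equal key values always land on the same node.
    crc32 is an assumed stand-in hash, chosen because it is deterministic."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        h = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[h % nodes].append(row)
    return parts

def entire(rows, nodes):
    """Every node receives a full copy of the data (e.g. a lookup reference)."""
    return [list(rows) for _ in range(nodes)]

# Hypothetical sample data: six rows, three distinct customer keys.
rows = [{"id": i, "cust": c} for i, c in enumerate(["A", "B", "A", "C", "B", "A"])]

even = round_robin(rows, 3)             # two rows per node, keys scattered
by_id = modulus(rows, 3, "id")          # ids grouped by id % 3
by_cust = hash_partition(rows, 3, "cust")  # all rows for one cust on one node
copies = entire(rows, 3)                # three full copies of all six rows
```

Round robin gives the most even spread but scatters rows for customer "A" across nodes, which is why Join, Merge and similar key-matching stages want a key-based method. Hash keeps each customer together, at the risk of uneven partitions if the key values are skewed. Entire avoids the question for the reference data, at the price of one full copy per node.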