Hello Forum,
When my source is a sequential file and I pass it through a Transformer stage with the default partitioning method specified, i.e. 'Auto', what exactly is done? Is the round robin method used, or does it perform hash partitioning on the key fields from my column metadata?
Thanks,
Greg
Default Partitioning method
Moderators: chulett, rschirm, roy
The default partitioning method is round robin.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
By default it is round robin, but it varies depending on the stages you use. If you use the same Auto partitioning on a DB2 stage, DB2 partitioning is used. If you use it on a Sort or Join stage, a hash partition is inserted automatically. Since Auto is not predictable, it is always advisable to choose a deliberate partitioning method.
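To illustrate the difference between the two methods discussed above, here is a minimal sketch in plain Python. This is a hypothetical illustration, not DataStage code; the row and partition structures are made up for the example:

```python
def round_robin_partition(rows, n_partitions):
    """Deal rows out one at a time, cycling through the partitions.
    Balances row counts evenly but ignores key values entirely."""
    parts = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        parts[i % n_partitions].append(row)
    return parts


def hash_partition(rows, n_partitions, key):
    """Route each row by hashing its key column, so every row with
    the same key value always lands in the same partition."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts
```

Round robin gives an even spread of rows, which is fine for a Transformer; hash guarantees that equal keys are colocated, which is why key-based stages like Sort and Join insert it under Auto.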
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
What if the data is already partitioned? Will the Join stage automatically repartition it on my join key?
For example, suppose I have two sequential file inputs written to their respective datasets using hash partitioning on key1. Then in a second job I have these two datasets as inputs to a Join stage, joining on key2. Will the Join stage automatically repartition the datasets on key2?
- Greg
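The scenario in the question above can be sketched to show why colocation on the join key matters. This is hypothetical Python with made-up row data, not DataStage behaviour verbatim; hash_partition is redefined here so the sketch stands alone:

```python
def hash_partition(rows, n_partitions, key):
    """Route rows so equal key values share a partition."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts


def partitioned_join(left_parts, right_parts, key):
    """Join each partition pair independently, the way a parallel
    join works; matches in different partitions are never seen."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        index = {}
        for r in rp:
            index.setdefault(r[key], []).append(r)
        for l in lp:
            out.extend({**l, **r} for r in index.get(l[key], []))
    return out


left = [{"key1": i, "key2": (i + 1) % 2, "lv": i} for i in range(4)]
right = [{"key1": i + 10, "key2": i % 2, "rv": i} for i in range(4)]

# Partitioned on key1 but joined on key2: matching rows can sit in
# different partitions, so the per-partition join silently drops them.
bad = partitioned_join(hash_partition(left, 2, "key1"),
                       hash_partition(right, 2, "key1"), "key2")

# Repartitioned on the join key first: all matches are found.
good = partitioned_join(hash_partition(left, 2, "key2"),
                        hash_partition(right, 2, "key2"), "key2")
```

In this made-up run `good` recovers every key2 match while `bad` loses matches, which is exactly the situation the automatic hash insertion on a Join stage is there to prevent. Whether it actually repartitions your datasets depends on the partitioning method set on the Join's input links: Auto lets the engine insert the hash on key2, while an explicit Same would preserve the key1 partitioning and risk missed matches.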