Partitioning

Bilwakunj · Post by **Bilwakunj** » Sat May 14, 2005 1:39 pm

Hi,
In my DDL, I've got 4 columns as the primary key of the table, and for all the Join stages I'm using "Hash" partitioning. Is this the right approach, or should I go for Auto? I've got the impression that with "Auto", DataStage uses "round robin" or "entire" partitioning internally, depending on the previous stages and the preserve partitioning flag. As per my requirement I shouldn't be going for either of them, so I'm using "Hash". Just wondering, is this the correct approach?
Thanks in advance.

GIDs · Post by **GIDs** » Sat May 14, 2005 3:55 pm

Hash is the better choice: rows with the same key values always land in the same partition, so you are guaranteed correct join results. You have to sort the input on all input links (if not previously sorted) in the same order as your join key, but you can partition on just one or two of the key columns, ones that you think will spread your data well across partitions and will also group it into distinct data sets.
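
To see why hashing on only a subset of the join key is still safe, here is a rough Python sketch. It is not how DataStage implements anything; the column names `k1`..`k4`, the partition count, and all the helper functions are made up for illustration. For brevity it joins each pair of partitions with a dictionary lookup rather than a sort-merge, so it shows only the partitioning side of the advice, not the sorting side.

```python
from zlib import crc32

NUM_PARTITIONS = 4                      # hypothetical node count
JOIN_COLS = ("k1", "k2", "k3", "k4")    # hypothetical 4-column key
HASH_COLS = ("k1", "k2")                # partition on a subset of the key


def hash_partition(rows, key_cols, n=NUM_PARTITIONS):
    """Send each row to the partition chosen by hashing key_cols only."""
    parts = [[] for _ in range(n)]
    for row in rows:
        key = "|".join(str(row[c]) for c in key_cols)
        parts[crc32(key.encode("utf-8")) % n].append(row)
    return parts


def round_robin(rows, n=NUM_PARTITIONS):
    """Deal rows out in arrival order, ignoring their contents."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def join_partitionwise(left_parts, right_parts, join_cols=JOIN_COLS):
    """Join partition i of the left link only with partition i of the right."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        index = {}
        for r in rp:
            index.setdefault(tuple(r[c] for c in join_cols), []).append(r)
        for l in lp:
            for r in index.get(tuple(l[c] for c in join_cols), []):
                out.append({**l, **r})
    return out


def make_key(i):
    return {"k1": i % 5, "k2": i // 5, "k3": i, "k4": "x"}


# Two links with the same 20 keys; the right link arrives in reverse order.
left = [{**make_key(i), "lval": i} for i in range(20)]
right = [{**make_key(i), "rval": i * 10} for i in reversed(range(20))]

hash_joined = join_partitionwise(
    hash_partition(left, HASH_COLS), hash_partition(right, HASH_COLS)
)
rr_joined = join_partitionwise(round_robin(left), round_robin(right))

print(len(hash_joined), len(rr_joined))
```

Rows that agree on all four key columns necessarily agree on `k1` and `k2`, so they hash to the same partition on both links and every match is found. With round robin the partition depends only on arrival order, so matching rows can end up on different nodes and matches are lost.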