Partition method for creating key change column

udayanguha
Premium Member
Posts: 37
Joined: Wed Oct 29, 2014 10:48 pm
Location: Ohio

Post by udayanguha »

Hi,
I am trying to create a key change column using the Sort stage. On the Partitioning tab, should I leave the method as Auto, so that DataStage will take care of the best partitioning method, or should I explicitly specify Hash partitioning in the property? I have heard different views: some people suggest always specifying the partition method explicitly, while others suggest leaving it as Auto. I am a bit confused about which to use.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Auto will give you Hash on the (entire) Sort key. It may be more efficient to specify explicitly under a couple of circumstances.
  • If there is high cardinality on the first Sort key, you may prefer to partition on that key only.
  • If the Sort key is an integer, then the Modulus algorithm will be more efficient than Hash.
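
To illustrate the difference between the two algorithms, here is a minimal Python sketch (not DataStage code). The function names, the sample columns cust_id and order_date, and the use of CRC32 as the hash are all assumptions made for illustration; DataStage's internal hash function is not reproduced here. Modulus is simply the integer key value mod the number of partitions, so it skips the hashing step entirely.

```python
import zlib


def hash_partition(key_values, num_partitions):
    """Pick a partition from a hash of one or more key values.

    CRC32 is only a deterministic stand-in; it is NOT DataStage's hash.
    """
    raw = "\x1f".join(str(value) for value in key_values).encode("utf-8")
    return zlib.crc32(raw) % num_partitions


def modulus_partition(int_key, num_partitions):
    """Pick a partition directly from an integer key: key mod partition count."""
    return int_key % num_partitions


# Hypothetical rows with a two-column Sort key (cust_id, order_date).
rows = [
    {"cust_id": 101, "order_date": "2015-01-03"},
    {"cust_id": 101, "order_date": "2015-02-11"},
    {"cust_id": 102, "order_date": "2015-01-03"},
    {"cust_id": 103, "order_date": "2015-03-20"},
]

NUM_PARTITIONS = 4

for row in rows:
    # Hash on the entire Sort key (what Auto is described as choosing).
    full_key = hash_partition((row["cust_id"], row["order_date"]), NUM_PARTITIONS)
    # Hash on the first Sort key only.
    first_key = hash_partition((row["cust_id"],), NUM_PARTITIONS)
    # Modulus on the integer first key.
    modulus = modulus_partition(row["cust_id"], NUM_PARTITIONS)
    print(row["cust_id"], row["order_date"], full_key, first_key, modulus)
```

Rows with the same full Sort key necessarily share the same first key, so partitioning on the first key alone still keeps every key group in a single partition.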
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant

Post by ray.wurlod »

udayanguha wrote: Datastage will take care of the best partitioning method
Not quite true. DataStage will select a partitioning method that will always work, that is, one guaranteed to partition the data correctly for the stage in question. It may not be the "best" (most efficient) method.
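
What "correctly" means for a key change column can be sketched in a few lines of Python (not DataStage code). Assumptions: the key change column is 1 on the first row of each key group and 0 on the rest, and each partition is sorted and flagged independently. The function names and sample keys are hypothetical. Round robin appears only as a contrast, as an example of a method that does not keep equal keys together.

```python
def key_change(sorted_keys):
    """Return 1 for the first row of each key group, 0 for the others."""
    flags = []
    previous = object()  # sentinel that never equals a real key
    for key in sorted_keys:
        flags.append(1 if key != previous else 0)
        previous = key
    return flags


def count_group_starts(keys, num_partitions, partitioner):
    """Partition the keys, sort each partition, and count the rows flagged 1."""
    partitions = [[] for _ in range(num_partitions)]
    for index, key in enumerate(keys):
        partitions[partitioner(index, key, num_partitions)].append(key)
    total = 0
    for part in partitions:
        part.sort()
        total += sum(key_change(part))
    return total


def round_robin(index, key, num_partitions):
    """Ignores the key, so rows with equal keys can land in different partitions."""
    return index % num_partitions


def modulus(index, key, num_partitions):
    """Key-based, so rows with equal keys always share a partition."""
    return key % num_partitions


keys = [7, 4, 7, 4, 7, 5, 5, 4]  # three distinct key values

print(count_group_starts(keys, 2, round_robin))  # 4: one group is flagged twice
print(count_group_starts(keys, 2, modulus))      # 3: one flag per distinct key
```

Any key-based method gives the right flags in this sketch; choosing between such methods is then a question of efficiency.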