Auto or Custom partitioning.


kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

Hi All,

When stages such as Aggregator, Join, or Sort are used, hash partitioning is normally chosen because it is a key-based partitioning method. If "Auto" partitioning is selected instead, will the results still be correct, and what partitioning will be inserted?
The manual says Auto picks the right partitioning method as needed. If so, is setting the partitioning explicitly just a way to have more control, or is there some other advantage?

Any inputs greatly appreciated.

Thank you all in advance.
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ

Post by mhester »

Yes, Auto should work just fine - the framework will put in sorts and repartitioners according to what you have defined as the key. It will almost always do this as a hash even for a single key that is entirely numeric (where modulus might have been a better choice).

Auto will also honor what you have done upstream in other operators or processes.
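
To illustrate why a key-based method matters for stages like Aggregator, here is a minimal Python sketch (not DataStage code). It compares key-based hash partitioning with round robin when each partition aggregates independently. The function names, the sample rows, and the use of CRC32 as the hash are all assumptions made for illustration; the framework's real hash function is not shown here.

```python
import zlib
from collections import defaultdict

def hash_partition(rows, key, n):
    """Send each row to a partition chosen from a hash of its key value."""
    parts = [[] for _ in range(n)]
    for row in rows:
        # CRC32 is only a deterministic stand-in for a real hash partitioner.
        idx = zlib.crc32(str(row[key]).encode("utf-8")) % n
        parts[idx].append(row)
    return parts

def round_robin_partition(rows, n):
    """Deal rows out to partitions in turn, ignoring the key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def aggregate_per_partition(parts, key, value):
    """Sum `value` per key inside each partition, with no cross-partition merge."""
    results = []
    for part in parts:
        totals = defaultdict(int)
        for row in part:
            totals[row[key]] += row[value]
        results.extend(totals.items())
    return sorted(results)

# Hypothetical sample data.
rows = [
    {"cust": "A", "amt": 10},
    {"cust": "A", "amt": 7},
    {"cust": "B", "amt": 5},
    {"cust": "B", "amt": 1},
]

hashed = aggregate_per_partition(hash_partition(rows, "cust", 2), "cust", "amt")
rr = aggregate_per_partition(round_robin_partition(rows, 2), "cust", "amt")

print(hashed)  # one total per customer
print(rr)      # each customer's rows were split, so partial totals appear
```

With hash partitioning every row for a given key lands in the same partition, so each key yields exactly one total. With round robin the rows for one key are spread across partitions and the per-partition sums are only partial.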
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

(Auto) will always give a working mechanism - that is, an algorithm that is guaranteed to work. However, it may not be optimal in regard to performance. Once you clearly understand how partitioning works, you may be able to use custom settings to have your job perform faster. For example, Modulus is less expensive than Hash if the key is an Integer of some kind. A key-based algorithm is less expensive than Entire in a multi-machine configuration, because Entire has to copy every row to every partition.
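
As a rough illustration of the Modulus versus Hash point, the Python sketch below (not DataStage code) shows both as key-based schemes: equal keys always go to the same partition in either case. Modulus needs only one remainder operation on an integer key, while a hash has to process the key's bytes first. The function names and the CRC32 stand-in are assumptions for the example, not the framework's actual implementation.

```python
import zlib

def modulus_partition(key: int, n: int) -> int:
    """Partition number is simply the integer key modulo the partition count."""
    return key % n

def hash_partition(key, n: int) -> int:
    """Partition number from a hash of the key; works for any key type.

    CRC32 over the key's text form is only a deterministic stand-in here.
    """
    return zlib.crc32(str(key).encode("utf-8")) % n

# Hypothetical integer keys, with 101 and 102 repeated.
keys = [101, 102, 103, 104, 101, 102]

by_modulus = [modulus_partition(k, 4) for k in keys]
by_hash = [hash_partition(k, 4) for k in keys]

print(by_modulus)  # [1, 2, 3, 0, 1, 2]
print(by_hash)     # repeated keys map to the same partition number
```

Either scheme keeps rows with the same key together, which is what Aggregator, Join, and Sort need. The difference is only in how much work it takes to compute the partition number.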
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.