Auto or Custom partitioning.

kollurianu · Post by **kollurianu** » Tue Jun 28, 2011 2:31 pm

Hi All,

When Aggregator, Join, or Sort stages are used, hash partitioning is typically chosen because it is a key-based partitioning method. If "Auto" partitioning is selected, will the results still be correct, and what partitioning would be inserted?
The manual says Auto picks the right partitioning method as needed. If so, is choosing the partitioning explicitly just a way to have more control on our side, or is there some other advantage?

Any inputs greatly appreciated.

Thank you all in advance.

mhester · Post by **mhester** » Tue Jun 28, 2011 2:51 pm

Yes, Auto should work just fine - the framework will put in sorts and repartitioners according to what you have defined as the key. It will almost always do this as a hash even for a single key that is entirely numeric (where modulus might have been a better choice).

Auto will also honor what you have done upstream in other operators or processes.
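
To illustrate why a key-based method matters for stages such as Aggregator, here is a minimal Python sketch. It is not DataStage code; the function names, the `cust` column, and the use of CRC32 as the hash are assumptions made only for illustration. It shows that when rows with the same key always land in the same partition, aggregating each partition independently gives the same answer as aggregating the whole data set, whereas a keyless method such as round robin can split a group across partitions.

```python
import zlib
from collections import Counter

def hash_partition(rows, key, n):
    """Send each row to a partition chosen from a hash of its key value."""
    parts = [[] for _ in range(n)]
    for row in rows:
        # CRC32 is a stand-in hash; the real framework's hash function may differ.
        h = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[h % n].append(row)
    return parts

def round_robin_partition(rows, n):
    """Deal rows out to partitions in turn, ignoring the key entirely."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def count_per_partition(parts, key):
    """Aggregate every partition on its own and concatenate the outputs."""
    out = []
    for part in parts:
        out.extend(Counter(r[key] for r in part).items())
    return sorted(out)

rows = [{"cust": c} for c in ["a", "b", "a", "c", "b", "a"]]

# Key-based: each customer appears exactly once in the combined output.
hash_counts = count_per_partition(hash_partition(rows, "cust", 2), "cust")

# Keyless: the same customer can be counted separately in several partitions.
rr_counts = count_per_partition(round_robin_partition(rows, 2), "cust")
```

With this sample data, `hash_counts` contains one row per customer, while `rr_counts` contains partial counts for customers whose rows were dealt to both partitions.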

ray.wurlod · Post by **ray.wurlod** » Tue Jun 28, 2011 3:10 pm

(Auto) will always give a working mechanism - that is, an algorithm that is guaranteed to work. However, it may not be optimal in regard to performance. Once you clearly understand how partitioning works, you may be able to use custom settings to have your job perform faster. For example, Modulus is less expensive than Hash if the key is an Integer of some kind. A key-based algorithm is less expensive than Entire in a multi-machine configuration.
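
The Modulus versus Hash point can be sketched roughly in Python. This is only an illustration under stated assumptions: the function names are hypothetical, and CRC32 stands in for whatever hash function the framework really uses. Both methods send equal keys to the same partition; modulus simply skips the hashing step, which is why it can be cheaper for integer keys.

```python
import zlib

def modulus_partition(key: int, n: int) -> int:
    """Partition number is the integer key modulo the partition count."""
    return key % n

def hash_partition(key, n: int) -> int:
    """Partition number comes from a hash of the key's bytes (assumed CRC32 here)."""
    return zlib.crc32(str(key).encode("utf-8")) % n

keys = [10, 11, 12, 13, 10, 11]
n_partitions = 4

mod_assignments = [modulus_partition(k, n_partitions) for k in keys]
hash_assignments = [hash_partition(k, n_partitions) for k in keys]
```

In both lists, the two occurrences of key 10 get the same partition number, as do the two occurrences of key 11. The modulus version does one arithmetic operation per row, while the hash version has to process the key's bytes first.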