Hash partitioning on the same subset key
Posted: Thu Sep 15, 2011 3:41 am
Hi,
In my job, two stages, Remove Duplicates and Join, are placed one after the other.
I have applied key-based hash partitioning on the first stage (Remove Duplicates). The key for that stage is columns A and B. For the Join stage that follows, the key is B.
My question: do I need to repartition the data on column B in the Join stage, or can I use "Same" partitioning there, given that the data is already hash partitioned on columns A and B in the previous stage and B is a subset of (A, B)?
Thanks and Regards
Avik Dasgupta
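
To make the question concrete, here is a minimal Python sketch of what hash partitioning on (A, B) versus B does to row placement. It is only an illustration: the CRC32-based partition_of function, the colocated helper, the sample rows and the partition count of 4 are all assumptions made for this example, and this is not the hash function DataStage actually uses.

```python
import zlib


def partition_of(row, keys, num_partitions):
    """Toy hash partitioner (assumption): CRC32 of the key values modulo the partition count."""
    key_bytes = "|".join(str(row[k]) for k in keys).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions


def colocated(rows, partition_keys, group_keys, num_partitions):
    """Return True if all rows sharing the same group_keys values land in one partition."""
    seen = {}
    for row in rows:
        group = tuple(row[k] for k in group_keys)
        part = partition_of(row, partition_keys, num_partitions)
        # Remember the first partition seen for this group and compare later rows to it.
        if seen.setdefault(group, part) != part:
            return False
    return True


# Hypothetical sample data: 50 values of A for each of 5 values of B.
rows = [{"A": a, "B": b} for a in range(50) for b in range(5)]

# Partitioned on (A, B): are rows with the same B kept together for the join?
join_safe_when_hashed_on_ab = colocated(rows, ["A", "B"], ["B"], 4)

# Partitioned on B only: are rows with the same (A, B) kept together for the dedup?
dedup_safe_when_hashed_on_b = colocated(rows, ["B"], ["A", "B"], 4)

print(join_safe_when_hashed_on_ab, dedup_safe_when_hashed_on_b)
```

In this sketch, hashing on (A, B) mixes A into the hash, so rows with the same B can end up in different partitions, and "Same" partitioning would not give the Join what it needs. The reverse direction does hold: rows that are equal on (A, B) are necessarily equal on B, so hashing on B alone keeps them together. That suggests one option worth testing, which is to hash partition on B ahead of Remove Duplicates and then use "Same" in the Join, provided the other join input is hash partitioned on B in the same way.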