Question on correlation between partitions and nodes
Posted: Mon Jan 21, 2013 12:45 pm
I am trying to understand the correlation between Partitions and Nodes.
The job design is two sequential files feeding a Funnel stage and a Remove Duplicates stage, followed by a Transformer and another sequential file.
I hash partitioned on a key in the Remove Duplicates stage and ran the job with an 8-node config file; according to the monitor, it ran in 8 partitions. When I ran it through the debugger, I noticed that two records with the same key value went to two different nodes (node1 and node4), yet the duplicates were still removed correctly.
1. Does this mean that those two nodes were in the same partition? If so, how do we know which node goes into which partition? My understanding is that records with the same key value need to go to the same partition under hash partitioning.
2. How would the records be distributed if I had 10 records with the same key value?
What's strange is that if I put just 1 record (with the same key) in each of the two input files instead of all the records, both of them go to the same node.
I am confused. Please let me know if I am not making any sense...
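To show what I expect hash partitioning to do, here is a minimal Python sketch of my understanding. It is not DataStage code: the CRC32 hash, the function names, and the sample keys are all my own assumptions for illustration, and I don't know which hash function the engine really uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 8  # mirrors the 8-node config file used for the job


def hash_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition number.

    CRC32 is only a stand-in here (an assumption); the real engine's
    hash function is not known to me.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def distribute(records, num_partitions: int = NUM_PARTITIONS):
    """Group (key, payload) records by the partition their key hashes to."""
    partitions = defaultdict(list)
    for key, payload in records:
        partitions[hash_partition(key, num_partitions)].append((key, payload))
    return dict(partitions)


# Hypothetical sample data: 10 records sharing one key, plus two other keys.
records = [("A100", i) for i in range(10)] + [("B200", 0), ("C300", 0)]
result = distribute(records)

for partition, rows in sorted(result.items()):
    print(partition, [key for key, _ in rows])
```

Under this model, all 10 records with key `A100` land in a single partition no matter how many other records are in the input, which is why seeing the same key on node1 and node4 surprises me.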