Entire Partitioner issue in Data set

Posted: Tue May 24, 2011 9:52 am
by Poornimayvs
Hi all,

I am trying to understand the differences between the various types of partitioner. I used a flat file as my input stage, and the output is a Data Set. My input has 11 records. When I use the Entire partitioner on the Partitioning tab, the generated output contains duplicate records: I get 22 records instead of 11.

Can anyone help me with this issue?

Thanks.

Posted: Tue May 24, 2011 11:04 am
by greggknight
Entire means just that: the entire data is written to every node.
I am assuming you have a two-node configuration, which would explain 2 x 11 = 22 records.
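
As a rough illustration only (plain Python, not DataStage code; the function name `entire_partition` and the sample rows are made up for this sketch), Entire partitioning can be modelled as giving every partition its own full copy of the input. If each partition is then written out, the stored row count is the input count times the number of nodes:

```python
def entire_partition(records, num_nodes):
    """Toy model of Entire partitioning: each node receives a full copy."""
    return [list(records) for _ in range(num_nodes)]


# 11 input records, as in the original question (values are placeholders).
records = [f"row{i}" for i in range(1, 12)]

# Assumption: a two-node configuration, as guessed above.
partitions = entire_partition(records, num_nodes=2)

# Every partition holds all 11 rows, so writing all partitions stores 22 rows.
total_written = sum(len(p) for p in partitions)
print(total_written)  # 22
```

With a four-node configuration the same model would give 44 rows, which is why the count depends on the configuration file rather than on the input.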

Posted: Tue May 24, 2011 6:28 pm
by SURA
You should read the documentation and understand where to use which partitioning method!

Posted: Wed May 25, 2011 12:07 am
by singhald
When you select the Entire partitioner, it copies all records to every node defined in the node configuration file.

For more details, go through the Parallel Job Advanced Developer's Guide.
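
To see how this differs from a partitioner that splits the data instead of copying it, here is a small Python model (not DataStage code; `round_robin_partition`, `entire_copy`, and `rows_written` are hypothetical helper names for this sketch). It compares the total rows held across all partitions for 1, 2, and 4 nodes:

```python
def round_robin_partition(records, num_nodes):
    """Toy model of round-robin: rows are dealt out to the nodes in turn."""
    parts = [[] for _ in range(num_nodes)]
    for i, rec in enumerate(records):
        parts[i % num_nodes].append(rec)
    return parts


def entire_copy(records, num_nodes):
    """Toy model of Entire: every node gets all rows."""
    return [list(records) for _ in range(num_nodes)]


def rows_written(parts):
    """Total rows across all partitions."""
    return sum(len(p) for p in parts)


records = list(range(11))  # 11 placeholder input rows

results = {}
for nodes in (1, 2, 4):
    results[nodes] = (
        rows_written(round_robin_partition(records, nodes)),
        rows_written(entire_copy(records, nodes)),
    )
    print(nodes, results[nodes])
```

In this model the round-robin total stays at 11 for any node count, while the Entire total grows with the number of nodes.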