Problem with Inner Join
Posted: Tue Jun 19, 2007 12:29 am
Hi,
I am joining two datasets with an inner join.
Both datasets were sorted and partitioned on 3 keys in an earlier job.
In the inner join job I apply some transformations to one of the datasets and then do the inner join. The join is on 5 keys, which include the 3 keys mentioned above. When I first ran this job with 1 lakh (100,000) records there was no problem, but with 10 lakh (1,000,000) records I get 2 extra records after the inner join. Please help me with this. I have also set the partitioning in the join stage to be the same as in the previous job, i.e. on the three keys.
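One check that might help narrow this down, sketched in Python outside the job itself. It assumes the extra records come from join-key combinations that occur more than once on an input, which is only one possible cause and is not confirmed here. The function names and the field names k1, k2 are hypothetical; the real job joins on 5 keys.

```python
from collections import Counter

def duplicate_keys(rows, key_fields):
    """Return join-key tuples that occur more than once in rows, with their counts."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}

def inner_join_count(left, right, key_fields):
    """Number of rows an inner join on key_fields would produce."""
    left_counts = Counter(tuple(r[f] for f in key_fields) for r in left)
    right_counts = Counter(tuple(r[f] for f in key_fields) for r in right)
    # Each matching key contributes (left occurrences) x (right occurrences) rows.
    return sum(n * right_counts[k] for k, n in left_counts.items() if k in right_counts)

# Hypothetical sample data: key (1, "a") is repeated on the left input.
left = [{"k1": 1, "k2": "a"}, {"k1": 1, "k2": "a"}, {"k1": 2, "k2": "b"}]
right = [{"k1": 1, "k2": "a"}, {"k1": 2, "k2": "b"}, {"k1": 3, "k2": "c"}]
keys = ["k1", "k2"]

print(duplicate_keys(left, keys))
print(inner_join_count(left, right, keys))
```

If a check like this shows no repeated key combinations on either input, the cause is probably elsewhere, for example in the transformations applied before the join or in the partitioning.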