Problem with Inner Join

nynali
Participant
Posts: 31
Joined: Thu May 03, 2007 11:52 pm
Location: Hyderabad

Post by nynali »

Hi,
I am joining two datasets with an inner join.
Both datasets were sorted and partitioned on three keys in an earlier job.
In the inner join job I apply some transformations to one of the datasets and then perform the join. The join uses five keys, which include the three keys mentioned above. When I first ran this job with 1 lakh (100,000) records there was no problem, but when I run it with 10 lakh (1,000,000) records I get 2 extra records after the inner join. In the inner join stage I have also specified the same partitioning as in the previous job, i.e. on the three keys. Please help me with this.
nynali
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Make the partitioning explicit. Don't rely on (Auto) to preserve the previous partitioning; it is just as likely to repartition using round robin.
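
To illustrate why the partitioning method matters for a parallel join, here is a minimal Python sketch. It is a toy model, not DataStage code: the column names (k1 to k5, lval, rval), the helper functions, and the four partitions are all assumptions made up for the example. It joins each pair of partitions independently, the way a parallel join works on each node. Hash partitioning both inputs on three of the five join keys gives the same result as a serial join, while round robin breaks the result. In this toy model the mismatch shows up as missing rows, so it does not reproduce the two extra records reported above; it only shows that the output of a partitioned join depends on how the inputs were partitioned.

```python
from collections import defaultdict

JOIN_KEYS = ("k1", "k2", "k3", "k4", "k5")  # hypothetical five join keys
PART_KEYS = ("k1", "k2", "k3")              # hypothetical three partitioning keys
N_PARTS = 4                                 # assumed number of partitions


def hash_partition(rows, key_cols, n_parts):
    """Send each row to the partition chosen by hashing its key columns."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        parts[hash(key) % n_parts].append(row)
    return parts


def round_robin_partition(rows, n_parts):
    """Deal rows out in turn, ignoring their key values."""
    parts = [[] for _ in range(n_parts)]
    for position, row in enumerate(rows):
        parts[position % n_parts].append(row)
    return parts


def inner_join(left, right, key_cols):
    """Plain serial inner join on key_cols."""
    index = defaultdict(list)
    for r in right:
        index[tuple(r[c] for c in key_cols)].append(r)
    out = []
    for l in left:
        for r in index.get(tuple(l[c] for c in key_cols), []):
            out.append({**l, **r})
    return out


def partitioned_join(left_parts, right_parts, key_cols):
    """Join partition i of the left input only with partition i of the right."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(inner_join(lp, rp, key_cols))
    return out


def make_row(i, value_col):
    # The moduli are pairwise coprime, so every i below 2310 gets a distinct key.
    return {"k1": i % 7, "k2": i % 5, "k3": i % 3, "k4": i % 2, "k5": i % 11,
            value_col: i}


left = [make_row(i, "lval") for i in range(200)]
right = [make_row(i, "rval") for i in range(50, 250)]

serial = inner_join(left, right, JOIN_KEYS)

hashed = partitioned_join(
    hash_partition(left, PART_KEYS, N_PARTS),
    hash_partition(right, PART_KEYS, N_PARTS),
    JOIN_KEYS,
)

rr = partitioned_join(
    round_robin_partition(left, N_PARTS),
    round_robin_partition(right, N_PARTS),
    JOIN_KEYS,
)

print("serial join rows:       ", len(serial))
print("hash-partitioned rows:  ", len(hashed))
print("round-robin rows:       ", len(rr))
```

Partitioning on a subset of the join keys is enough here because any two rows that agree on all five join keys also agree on the three partitioning keys, so they land in the same partition. Round robin gives no such guarantee, which is why the partitioning should be set explicitly rather than left to (Auto).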
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.