Join output problem

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Join output problem

Post by ady »

I am joining 2 tables with a "left inner join" , The key combination I am using is promo_id and offer_Id.

Table A(Left) has 117500 rows, Table B(Right) has 1490 rows. But I get 274470 rows as output ?. I really dont understand?..... can anyone suggest what seems to be the problem here ?

Thnx
samsuf2002
Premium Member
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennesse

Post by samsuf2002 »

Try to sort the data before you do join
hi sam here
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

There's no such thing as a left inner join. Are you using a left outer join, or an inner join?

An inner join will return a number of rows no greater than the smaller of the row counts of the two inputs. A left outer join will return a number of rows no greater than the number of rows in the left input.

Only a full outer join can return more rows than either of the input Data Sets (but see next paragraph).

The Join stage requires its inputs to be identically partitioned and sorted (based on the join key(s)). Without this condition being satisfied, you are likely to get spurious additional matches.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

As noted there is nothing like 'left inner join'. you might be trying 'left outer join'. you need to sort the data on the 'keys' and also check how the data is partitioned.
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

I was using a "Left Outer Join" :oops:

I am using the inbuilt sort (Ascending, Case Sensitive) and the Partion type is "Hash"
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

What is your job design? Wrong partition will screw up the number of records that been passed. Are the both stream well partitioned on the key (as you mentioned 'Hash')?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Are both inputs identically hash partitioned and sorted (on the join keys)?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

The data I have dosent make sense .... I need to recheck it ! :?

Thanx for the help guys !
Post Reply