Hi I have some problem with join,lookup and merge stages. I am giving my requirement below.
I have two seq files each of which are having one column with name ROWNUM, and in both the file data is 3 records like below
100
100
100
if i use join stage to join these two files and performing innerjoin and left outerjoin, for both operations it is giving out put 9 records like.
100
100
100
100
100
100
100
100
100
if i use Lookup stage and given options as drop, continue. for both the operations it is giving out put 3 records.
100
100
100
and if use merge stage and given option as drop then it is giving 3 records as output like
100
100
100
and if i use option as keep then it is giving 5 records as output like.
100
100
100
100
100
I think all the 3 stages should produce same result why the 3 stages join, lookup and merge are giving different results.
Can any one help me please.
Difference Between join, Lookup and merge
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 1
- Joined: Wed Oct 13, 2010 10:30 am
Difference Between join, Lookup and merge
AnilKumar K
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
For starters, Merge consumes rows from its reference (update) input(s). The other two stage types don't. Your Keep rule iss giving variations because you end up with unmatched masters after all the reference rows have been consumed.
So what do you actually want the output to be? Both results you got are correct, because the specifications were different.
So what do you actually want the output to be? Both results you got are correct, because the specifications were different.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 135
- Joined: Tue Aug 14, 2007 4:27 am
- Location: Mumbai
to simplify you are joining , merging , performing lookup with two identical files..... or two same files ...
if u use two different files you will see some more magic ..
if they produce same results then why we had three diff stages...
in some cases they may produce same results ..
Join -> will give you cartesian product( set of matching recs ). as ur file is identical its gives you 3*3 = 9 rows.
Merge-> reference record is consumed ( once a matching ip found )
typically used to populate master -child records ... and when you need non matching records from reference ( as it allows a reject link )
Lookup -> no cartesian product ... to populate a column based on matching keys
if u use two different files you will see some more magic ..
if they produce same results then why we had three diff stages...
in some cases they may produce same results ..
Join -> will give you cartesian product( set of matching recs ). as ur file is identical its gives you 3*3 = 9 rows.
Merge-> reference record is consumed ( once a matching ip found )
typically used to populate master -child records ... and when you need non matching records from reference ( as it allows a reject link )
Lookup -> no cartesian product ... to populate a column based on matching keys
Thanks
Swapnil
"Whenever you find whole world against you just turn around and Lead the world"
Swapnil
"Whenever you find whole world against you just turn around and Lead the world"
-
- Participant
- Posts: 63
- Joined: Fri Jun 22, 2007 7:27 am
- Location: Kolkata
-------------anilkumark802 wrote:Hi I have some problem with join,lookup and merge stages. I am giving my requirement below.
I have two seq files each of which are having one column with name ROWNUM, and in both the file data is 3 records like below
100
100
100
if i use join stage to join these two files and performing innerjoin and left outerjoin, for both operations it is giving out put 9 records like.
100
100
100
100
100
100
100
100
100
if i use Lookup stage and given options as drop, continue. for both the operations it is giving out put 3 records.7
100
100
100
and if use merge stage and given option as drop then it is giving 3 records as output like
100
100
100
and if i use option as keep then it is giving 5 records as output like.
100
100
100
100
100
I think all the 3 stages should produce same result why the 3 stages join, lookup and merge are giving different results.
Can any one help me please.
Good morning my dear friend
I think that it's only a logical problem and I try to explain it by an example. We have two tables with tree rows in the first and three rows in the second one.
Table One Table Two
Column A Column B
100 100
100 100
100 100
I develop the following SQL query
SELECT COLUMN A FROM One, Two WHERE A = B
For each existing row in Table One the query macth three rows satisfating the predicate A=B. So for the first row in table one match three rows in table two. This is the reason explaining
an output with nine rows. The Join Stage works in the same way (this is true both for the inner and left join).
The lookup stage instead works differently. It reads n rows and produce exactly the same number of row in output (if they exist in the reference link).
About the tirth Stage, the Merge one, I don't know it very well but I suppose that in the first case it distincts (aggregates) the rows and not in the second.
I hope that my explanation are clear.
Have a nice day
Umberto