about reference table and driving table

ranga1970 · Post by **ranga1970** » Wed Dec 01, 2004 9:59 am

Hi folks, :D

Here is my question, I was opininion that

if you use stream link and reference link, which means we are making an leftouter join(ie passing all values from stram link and matching rows only from reference link)

But my data is like this, reference table is having composit key and the hence the join condition key is duplicable

for example

stream link

SNO empno
1 101
2 102
3 103
5 107
6 109

Reference link
Sno deptno empname
1 1 abc
1 2 bad
2 1 xyz
2 2 abd

if make join in datastage on sno of both, out put would be only 5 rows since you wil have records not more than stream records total records

but if you make left outer join using sql you would 7 records, since Sno 1 and 2 will have two records each and 3, 5, 6 will have one record

so the testing department says its failure,

one method i figured out is having only one stage and making leftouter join at the stage out put SQL, but my question is does the Datastage provide any solution for this kind of data, can have multiple primary key table as reference at all

Could some one throw light on me regarding this and what exactly happens in the background when stream link and reference link or joined

If some could help me in this you would be doing great help, help requested with utmost urgency please :D

[/code]

kcbland · Post by **kcbland** » Wed Dec 01, 2004 10:03 am

It does but that's not your problem. Your problem is that your design is wrong. You've assumed that the SQL is joining one way, but you need to cogitate about relational theory and realize what you think is the stream is really being handled as the "reference" vice versa.

Change your stream link to the reference and your reference to the stream.

Now, when you process the stream, you get a 1:1 reference lookup and everything is correct.

ranga1970 · Post by **ranga1970** » Wed Dec 01, 2004 10:07 am

i did try that my stream is also of multiple keys but different
here my source and reference both are multiple keys
source primary keys are different from reference and i can join only one

kcbland · Post by **kcbland** » Wed Dec 01, 2004 10:09 am

Maybe you should post your actual data instead of the emp table example. That way we can talk about apples instead of looking at oranges.

ranga1970 · Post by **ranga1970** » Wed Dec 01, 2004 10:12 am

Well i am sorry for not being precise in my first post

is there any way to do that kind of join in datastage betwen two table having multiple primary keys

ranga1970 · Post by **ranga1970** » Wed Dec 01, 2004 10:13 am

Well i am sorry for not being precise in my first post

is there any way to do that kind of join in datastage betwen two table having multiple primary keys

kcbland · Post by **kcbland** » Wed Dec 01, 2004 10:15 am

Yes but in different ways. If your data is two tables in the same source instance you may wish to just use SQL to solve the problem.

If your data is in two sequential files you can use the Merge stage. You could also build hash files to accomplish what you wish, using a UV/ODBC stage with multi-row return set enabled on the reference lookup.

Without a little bit more information, like a sampling of your actual data, it's kind of hard to give you a best recommendation... HINT

ranga1970 · Post by **ranga1970** » Wed Dec 01, 2004 10:20 am

I got the knowledge, really i solved the problem using the sql since its from same stage but i waqnt to have some knowledge, i got the tip and i will go find out from help how to use merge stage for the future usage

thanks thanks thanks :D

chucksmith · Post by **chucksmith** » Wed Dec 01, 2004 10:21 am

It is a little involved, but you can denormalize your reference file into a hash file with multi-value columns.

This will now return one denormalized row from the lookup. Write the contents of your stream and reference rows to another hash file.

Finally, read from this new hash file, selecting the appropriate column from the Normalize On drop box. Output the normalize rows.

Based on some comments in other posts, you may have to dummy values where the original reference lookup fails.

As Ken said, we could be more specific about a solution with more exact data samples, but this is a workable solution.

leomauer · Post by **leomauer** » Wed Dec 01, 2004 10:39 am

Another trick. Also a little involved:
1. Unload your hash file into flat file format same as the input file.
2. Concatenate input file to this unload file (input file last).
3. Load the result into new hash file with the key to eliminate duplicates(this will keep input file row where it duplicates the other file row).
4. Use this hash file as a stream input to your look up.
5. Of course the rows left from the original hash file will match themselves but then the code can blank out the left side of the matched row thus imitating left outer join.
Just a trick.