lookup with multiple columns

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

DSguru2B wrote: What i am doing here is checking if the value is '*', i am replacing it with the value present in your database table keyed on Col1. Now you see why i asked for a single column that can never be an *.
After handling any occurance of * i am doing a join on all the seven columns and getting column 8. The driver table is your temp table that now contains the contents of your file.
Thanks DSguru2B,
here i have a problem
from the above logic, first its checking the join conditions and then assigning the values
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Hmm. Well, i guess the only other way is to build a dataset using the query and do the join in datastage on all seven columns. Yea, do that, build the dataset with the following query

Code: Select all

SELECT F.col1, 
      (CASE F.col2 
           WHEN '*' THEN (SELECT sub_D.col2 from DataBaseTable sub_D  
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol2 

      (CASE F.col3 
           WHEN '*' THEN (SELECT sub_D.col3 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol3 
          
      (CASE F.col4 
           WHEN '*' THEN (SELECT sub_D.col4 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol4 

      (CASE F.col5 
           WHEN '*' THEN (SELECT sub_D.col5 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol5 

      (CASE F.col6 
           WHEN '*' THEN (SELECT sub_D.col6 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol6 

      (CASE F.col7 
           WHEN '*' THEN (SELECT sub_D.col7 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol7 
    
From TempTableFromFile F
In datastage, do a join on all the seven columns and retrieve the 8th one.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

How to handle if we get more than one match for col1(temp table) to col1(database table). col1(database table) can have repeated values.

example:

temp table 1001 * 200 * 400 500 600

database table 1001 100 200 300 400 500 600 2001
1001 150 200 350 400 500 600 2002

Its showing an message like, subquery can't return more than one value.
Result is no rows selected.

Thanks
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Well thats for you to decide. You can go with either max or min. Change the subquery accordingly. A part of it is as follows

Code: Select all

SELECT F.col1, 
      (CASE F.col2 
           WHEN '*' THEN (SELECT MIN(sub_D.col2) from DataBaseTable   
            sub_D  where sub_D.col1 = F.col1
            group by sub_D.col1) 
           ELSE F.col2) F.finCol2 
...... and so on
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

DSguru2B wrote: You can go with either max or min. Change the subquery accordingly.
i need all the matches, means i have to retrieve 2001 and 2002 from(col8).
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

How many values can there be, can it be more than 2? This is getting really ugly. Start thinking of an alternative route.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

I don't want to get into this but VSI can you again tell us what exactly should be your final output format.
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

DSguru2B wrote:How many values can there be, can it be more than 2?
yes, it can have more than 2.
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

us1aslam1us wrote:I don't want to get into this but VSI can you again tell us what exactly should be your final output format.
temp table
1001 * 200 * 400 500 600

database table
1001 100 200 300 400 500 600 2001
1001 150 200 350 400 500 600 2002

output should be
1001 100 200 300 400 500 600 2001
1001 150 200 350 400 500 600 2002
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

In your temp table, the first column, can that have duplicates?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

DSguru2B wrote:In your temp table, the first column, can that have duplicates?
yes, it can
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Well, right now i am running out of ideas. The only other thing that i can think of is doing everything via a script. A nasty shell/awk script. Start thinking in those lines. If i have time on monday i can work on it, but no promises.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

DSguru2B wrote:Well, right now i am running out of ideas. The only other thing that i can think of is doing everything via a script. A nasty shell/awk script. Start thinking in those lines. If i have time on monday i can work on it, but no promises.
DSguru2B,
Could you please give me some suggestions for writing shell script.

Thanks a lot for your help.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Sure. Read a record from your source file, get the first column, find all records that match that col1 in the second file, lets call it dbfile. Now take one row at a time and do check each and every column, if all columns match well and good, write it to your output file, if not then check if the failed columns are *, if yes then replace it with the value from the db file. Repeat this process for each row from your source. Its not all that hard. I am just a little busy this week, if i find time, i might write it for you. But no promises, start working on it.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

From the Dictionary of Computer Terms

Recursion [n]: see Recursion


You are going to have to create something - whether a job sequence, a routine, whatever - to process this Data Set over and over until there are no more child links.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply