lookup with multiple columns

vsi · Post by **vsi** » Wed Jan 03, 2007 9:10 am

DSguru2B wrote: What i am doing here is checking if the value is '*', i am replacing it with the value present in your database table keyed on Col1. Now you see why i asked for a single column that can never be an *.
After handling any occurance of * i am doing a join on all the seven columns and getting column 8. The driver table is your temp table that now contains the contents of your file.

Thanks DSguru2B,
here i have a problem
from the above logic, first its checking the join conditions and then assigning the values

DSguru2B · Post by **DSguru2B** » Wed Jan 03, 2007 9:25 am

Hmm. Well, i guess the only other way is to build a dataset using the query and do the join in datastage on all seven columns. Yea, do that, build the dataset with the following query

Code: Select all

SELECT F.col1, 
      (CASE F.col2 
           WHEN '*' THEN (SELECT sub_D.col2 from DataBaseTable sub_D  
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol2 

      (CASE F.col3 
           WHEN '*' THEN (SELECT sub_D.col3 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol3 
          
      (CASE F.col4 
           WHEN '*' THEN (SELECT sub_D.col4 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol4 

      (CASE F.col5 
           WHEN '*' THEN (SELECT sub_D.col5 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol5 

      (CASE F.col6 
           WHEN '*' THEN (SELECT sub_D.col6 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol6 

      (CASE F.col7 
           WHEN '*' THEN (SELECT sub_D.col7 from DataBaseTable sub_D 
                          where sub_D.col1 = F.col1) 
           ELSE F.col2) F.finCol7 
    
From TempTableFromFile F

In datastage, do a join on all the seven columns and retrieve the 8th one.

vsi · Post by **vsi** » Fri Jan 05, 2007 9:19 am

How to handle if we get more than one match for col1(temp table) to col1(database table). col1(database table) can have repeated values.

example:

temp table 1001 * 200 * 400 500 600

database table 1001 100 200 300 400 500 600 2001
1001 150 200 350 400 500 600 2002

Its showing an message like, subquery can't return more than one value.
Result is no rows selected.

Thanks

DSguru2B · Post by **DSguru2B** » Fri Jan 05, 2007 9:41 am

Well thats for you to decide. You can go with either max or min. Change the subquery accordingly. A part of it is as follows

Code: Select all

SELECT F.col1, 
      (CASE F.col2 
           WHEN '*' THEN (SELECT MIN(sub_D.col2) from DataBaseTable   
            sub_D  where sub_D.col1 = F.col1
            group by sub_D.col1) 
           ELSE F.col2) F.finCol2 
...... and so on

vsi · Post by **vsi** » Fri Jan 05, 2007 9:45 am

DSguru2B wrote: You can go with either max or min. Change the subquery accordingly.

i need all the matches, means i have to retrieve 2001 and 2002 from(col8).

DSguru2B · Post by **DSguru2B** » Fri Jan 05, 2007 10:25 am

How many values can there be, can it be more than 2? This is getting really ugly. Start thinking of an alternative route.

us1aslam1us · Post by **us1aslam1us** » Fri Jan 05, 2007 10:37 am

I don't want to get into this but VSI can you again tell us what exactly should be your final output format.

vsi · Post by **vsi** » Fri Jan 05, 2007 11:20 am

DSguru2B wrote:How many values can there be, can it be more than 2?

yes, it can have more than 2.

vsi · Post by **vsi** » Fri Jan 05, 2007 11:24 am

us1aslam1us wrote:I don't want to get into this but VSI can you again tell us what exactly should be your final output format.

temp table
1001 * 200 * 400 500 600

database table
1001 100 200 300 400 500 600 2001
1001 150 200 350 400 500 600 2002

output should be
1001 100 200 300 400 500 600 2001
1001 150 200 350 400 500 600 2002

DSguru2B · Post by **DSguru2B** » Fri Jan 05, 2007 11:39 am

In your temp table, the first column, can that have duplicates?

vsi · Post by **vsi** » Sun Jan 07, 2007 1:22 am

DSguru2B wrote:In your temp table, the first column, can that have duplicates?

yes, it can

DSguru2B · Post by **DSguru2B** » Sun Jan 07, 2007 2:37 am

Well, right now i am running out of ideas. The only other thing that i can think of is doing everything via a script. A nasty shell/awk script. Start thinking in those lines. If i have time on monday i can work on it, but no promises.

vsi · Post by **vsi** » Tue Jan 09, 2007 8:30 pm

DSguru2B wrote:Well, right now i am running out of ideas. The only other thing that i can think of is doing everything via a script. A nasty shell/awk script. Start thinking in those lines. If i have time on monday i can work on it, but no promises.

DSguru2B,
Could you please give me some suggestions for writing shell script.

Thanks a lot for your help.

DSguru2B · Post by **DSguru2B** » Tue Jan 09, 2007 8:35 pm

Sure. Read a record from your source file, get the first column, find all records that match that col1 in the second file, lets call it dbfile. Now take one row at a time and do check each and every column, if all columns match well and good, write it to your output file, if not then check if the failed columns are *, if yes then replace it with the value from the db file. Repeat this process for each row from your source. Its not all that hard. I am just a little busy this week, if i find time, i might write it for you. But no promises, start working on it.

ray.wurlod · Post by **ray.wurlod** » Tue Jan 09, 2007 11:41 pm

From the Dictionary of Computer Terms

Recursion [n]: see Recursion

You are going to have to create something - whether a job sequence, a routine, whatever - to process this Data Set over and over until there are no more child links.