Page 1 of 1

Lookup - different scenario

Posted: Tue May 13, 2008 6:39 am
by prematram
Hi,

i need solution for the following scenario:

i am having one main file and reference file. I need to lookup data based on specific value which may be in any column, not based on specific column...

ie in lookup stage what happens is we give a key column with which matched records w.r.t that column is taken as output.

but the secanario here is we have to take matched records based on specific value that may present in any column.

whether this is possible...??

Posted: Tue May 13, 2008 8:46 am
by ArndW
It is possible. There are a couple of ways you can do this, but the optimal solution depends upon the number of columns and the type of data file (is it a table or a sequential file?) as well as the file size and what database you are using.

Posted: Tue May 13, 2008 12:43 pm
by ag_ram
There are many ways as mentioned earlier, if you happen to present the clear metadata of 'reference file'.(especially, the same Datatype of the Columns, Number of Columns)

Posted: Tue May 13, 2008 4:57 pm
by ray.wurlod
Actually, it's not really related to the metadata at all. The requirement, if I read it correctly, is to find a row based on looking for a "search key" value in any column in that row. The only metadata limitation would be one of compatible data types. As Arnd said, the optimal solution will depend on just how many columns (and rows) there are in the reference data, and how those data are stored.

Posted: Wed May 14, 2008 12:32 am
by ag_ram
Correct ray.wurlod,

A small question,
ray.wurlod wrote:....and how those data are stored
Could you please expand - How this can be one of the dependencies?

Posted: Wed May 14, 2008 3:03 am
by just4geeks
Hi ,

As I understand the problem, the sample data may be like this:

Input data:

IpCol1 Ipcol2
a 1
b 2
c 3
d 4
yy 5

Reference data:
Refcol1 Refcol2 Refcol3 refcol4 ....
a aa aaa aaaa
x b xx xxx
y z c yy

Lets say we are looking up for Ipcol1 of Input data then our output should be
(Ipcol1 is mapped to Opcol1 and Ipcol2 to Opcol2)

Opcol1 Opcol2
a 1
b 2
c 3
yy 5

Please correct me if I am wrong. But still I can't understand how this logic can be implemented?

Thanks

Posted: Wed May 14, 2008 4:12 am
by ray.wurlod
Short answer - you have more options for being selective with the data if they are in a table compared to if they are in a sequential file.

Posted: Wed May 14, 2008 4:34 am
by aakashahuja
This is something I would do although I am sure that others would have better ways to suggest on this:

Lets say I have 5 columns in data set to be referenced, say col1, col2, col3,col4,col5.
I would :
1. Concatenate all the columns together using a separator that doesnt exist the data set which is to be looked up. For ex col1@col2@col3@col4@col5.
2. This gives me 1 single column. Now using some string functions, I would find the presence of the source value in the big concatenated string.

Cheers
Aakash