Look up multiple columns
Moderators: chulett, rschirm, roy
Look up multiple columns
Hi,
I have a file(CodeFile) with 3 columns Code1,Code2,Code3.
Eg.
CodeFile
1,2,3
2,3,4
I need to lookup the Description for each column from the same table Code_Desc. The table has 2 fields (Code, Description).
The table has values
Code,Description
1,One
2,Two
3,Three
4,Four
now i need to create a target file by looking up the Code_Desc as follows
1,One,2,Two,3,Three
2,Two,3,Three,4,Four
Currently, I read the data from the table Code_desc using Db2 stage.
Then using Copy stage i am pass 3 reference links in the Look-up stage(One for each field)
Is there any other way of doing this?
What should I do if I need to lookup 17 fields instead of 3 fields (Do i need to have 17 reference links in this case?)?
Thanks in advance
Senthil
I have a file(CodeFile) with 3 columns Code1,Code2,Code3.
Eg.
CodeFile
1,2,3
2,3,4
I need to lookup the Description for each column from the same table Code_Desc. The table has 2 fields (Code, Description).
The table has values
Code,Description
1,One
2,Two
3,Three
4,Four
now i need to create a target file by looking up the Code_Desc as follows
1,One,2,Two,3,Three
2,Two,3,Three,4,Four
Currently, I read the data from the table Code_desc using Db2 stage.
Then using Copy stage i am pass 3 reference links in the Look-up stage(One for each field)
Is there any other way of doing this?
What should I do if I need to lookup 17 fields instead of 3 fields (Do i need to have 17 reference links in this case?)?
Thanks in advance
Senthil
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hi,
May be something better for you would be to replace "," by a end of line (I think it is char(013)). You need first to flag each line individualy with Surrogate_Key_Generator stage, then you obtain a file like this :
Surrogate CodeFile
1 1
1 2
1 3
2 2
2 3
2 4
Instead of
CodeFile
1,2,3
2,3,4
Then you can use a LookUp on a single column.
1 1 One
1 2 Two
1 3 Three
2 2 Two
2 3 Three
2 4 Four
And to finish, use a stage variable to identify the columns who belong from the same line in the input file (same Surrogate) and separate the record with ",".
Be carrefull, I think you must partition by "Surrogate".
That way you can have as many column as you want.
Have fun.
Matthieu.
May be something better for you would be to replace "," by a end of line (I think it is char(013)). You need first to flag each line individualy with Surrogate_Key_Generator stage, then you obtain a file like this :
Surrogate CodeFile
1 1
1 2
1 3
2 2
2 3
2 4
Instead of
CodeFile
1,2,3
2,3,4
Then you can use a LookUp on a single column.
1 1 One
1 2 Two
1 3 Three
2 2 Two
2 3 Three
2 4 Four
And to finish, use a stage variable to identify the columns who belong from the same line in the input file (same Surrogate) and separate the record with ",".
Be carrefull, I think you must partition by "Surrogate".
That way you can have as many column as you want.
Have fun.
Matthieu.
Great ideampouet wrote:Hi,
May be something better for you would be to replace "," by a end of line (I think it is char(013)). You need first to flag each line individualy with Surrogate_Key_Generator stage, then you obtain a file like this :
Surrogate CodeFile
1 1
1 2
1 3
2 2
2 3
2 4
Instead of
CodeFile
1,2,3
2,3,4
Then you can use a LookUp on a single column.
1 1 One
1 2 Two
1 3 Three
2 2 Two
2 3 Three
2 4 Four
And to finish, use a stage variable to identify the columns who belong from the same line in the input file (same Surrogate) and separate the record with ",".
Be carrefull, I think you must partition by "Surrogate".
That way you can have as many column as you want.
Have fun.
Matthieu.
![Idea :idea:](./images/smilies/icon_idea.gif)
In simple, Use a pivot stage to make in single column and then do a lookup. Later do a vertical pivot to get the original structure back... Hmm... may be worth trying.
Pivot is also a resource consuming stage as like Lookup.
Hi senthil,
Better do a test on both the approach and find the best option. Share to use if possible.
If you are not aware of vertical pivot do a serch on the same key word.
![Wink :wink:](./images/smilies/icon_wink.gif)
-Kumar
Hi,
The two big good points with my method are :
- Your file can be "unstructured". It means you can have a variable number of fields. For example :
1,2,3,4
4,5
8,10,20,21,12,12,14,21
- You don't have to modify your developpment if some fields have to be added. That's a good point isn't it !
For information, we tried once to make a job with 40 LookUp (each with one line). In fact we needed 150 LookUp, but we cut the job in 4 jobs. The performance slowed down dramaticaly : 100 rows/sec.
As only one line was in each (150) LookUp. I managed it with the job parameters and obtained 19000 rows/sec.
Conclusion : avoid too much LookUp...
Bye
Matthieu
The two big good points with my method are :
- Your file can be "unstructured". It means you can have a variable number of fields. For example :
1,2,3,4
4,5
8,10,20,21,12,12,14,21
- You don't have to modify your developpment if some fields have to be added. That's a good point isn't it !
For information, we tried once to make a job with 40 LookUp (each with one line). In fact we needed 150 LookUp, but we cut the job in 4 jobs. The performance slowed down dramaticaly : 100 rows/sec.
As only one line was in each (150) LookUp. I managed it with the job parameters and obtained 19000 rows/sec.
Conclusion : avoid too much LookUp...
Bye
Matthieu
hi,
i am new to datastage.
i was trying out this solution provided by Matthieu & i'm facing a problem. plz help me out.
to convert the transformed file back to its original form, i used 2 stage variables, StgPrev & stgCurr for storing previous & current values of the Surrogate key & the following derivation to get the output in a single column. i used "," to seperate rows having s=different Surrogate key
If StgPrev = StgCurr then DSLink25.code:"|" : DSLink25.code_desc Else "," :DSLink25.code:"|" : DSLink25.code_desc
& in the output file set the record delimiter as "|".
i am getting the following output
1|One|2|Two|3|Three|,2|Two|3|Three|4|Four|
when i use char(013) i get the following output & wc -l gives 0
1|One|2|Two|3|Three|
2|Two|3|Three|4|Four|
How can I specifiy Unix newline so that i may get the correct value of wc -l & the required output.
Thanks & Regards
r
i am new to datastage.
i was trying out this solution provided by Matthieu & i'm facing a problem. plz help me out.
to convert the transformed file back to its original form, i used 2 stage variables, StgPrev & stgCurr for storing previous & current values of the Surrogate key & the following derivation to get the output in a single column. i used "," to seperate rows having s=different Surrogate key
If StgPrev = StgCurr then DSLink25.code:"|" : DSLink25.code_desc Else "," :DSLink25.code:"|" : DSLink25.code_desc
& in the output file set the record delimiter as "|".
i am getting the following output
1|One|2|Two|3|Three|,2|Two|3|Three|4|Four|
when i use char(013) i get the following output & wc -l gives 0
1|One|2|Two|3|Three|
2|Two|3|Three|4|Four|
How can I specifiy Unix newline so that i may get the correct value of wc -l & the required output.
Thanks & Regards
r
How are you viewing your file? If you put a line feed in there it might not show correctly. If your "wc -l" doesn't show 2 then something went wrong with inserting CHAR(10). I just answered your technical question - but what is your goal and what is the problem?r wrote:when i use line feed the output is as follows: 1|One|2|Two|3|Three2|Two|3|Three|4|Four|
when i use carriage return, the output is as follows:
1|One|2|Two|3|Three|
{spaces}2|Two|3|Three|4|Four|
the problem still exists.
Regards
r
hi,
i have to perform multiple lookup. i have a source file of the form:
key
origin country
trans country
destination country
All the 3 country fields contain country codes
The reference file contains country code & country name.
i perform a lookup using the above method. But after the lookup all rows get shuffled up due to partitioning. i am using hash partitioning on the primary input & entire partitioning on the reference. For eg.
Reference file
US United States
IN India
GB Great Britain
JP Japan
Source file:
1|US|GB|IN
2|JP|IN|GB
Using the method described in this post, after the lookup & after sorting on the surr_key & key i may get something like this:
Surr_key|key|origincountry|trans country|dest country
0|1|US|United States
0|1|IN|India
0|1|GB|Great Britain
1|2|IN|India
1|2|GB|Great Britain
1|2|JP|Japan
Here if i perform vertical pivot it will shuffled up values for origin, trans, destination countries.
The only way i get back the original order is using SEQUENTIAL execution mode for the lookup but using defeats the whole purpose of partitioning & fast execution.
Please suggest a soluiton.
Thnaks n Regards
r
i have to perform multiple lookup. i have a source file of the form:
key
origin country
trans country
destination country
All the 3 country fields contain country codes
The reference file contains country code & country name.
i perform a lookup using the above method. But after the lookup all rows get shuffled up due to partitioning. i am using hash partitioning on the primary input & entire partitioning on the reference. For eg.
Reference file
US United States
IN India
GB Great Britain
JP Japan
Source file:
1|US|GB|IN
2|JP|IN|GB
Using the method described in this post, after the lookup & after sorting on the surr_key & key i may get something like this:
Surr_key|key|origincountry|trans country|dest country
0|1|US|United States
0|1|IN|India
0|1|GB|Great Britain
1|2|IN|India
1|2|GB|Great Britain
1|2|JP|Japan
Here if i perform vertical pivot it will shuffled up values for origin, trans, destination countries.
The only way i get back the original order is using SEQUENTIAL execution mode for the lookup but using defeats the whole purpose of partitioning & fast execution.
Please suggest a soluiton.
Thnaks n Regards
r
What is the partiton key used?
What is the order of the vlaues after the pivoting is performed. Write into a file and check if it is in desired order. I hope Pivoting operation is performed in Sequential mode.
What is the order of the vlaues after the pivoting is performed. Write into a file and check if it is in desired order. I hope Pivoting operation is performed in Sequential mode.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'