Matching records between sources

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
csri
Participant
Posts: 99
Joined: Wed Jul 11, 2007 8:02 am

Matching records between sources

Post by csri »

Hello Gurus, I need your once again. Here are the details.

1. I have a table with names in an Oracle db table and the names may contain space or comma or apostrophe or a dash
2. I have an input file to lookup with four character names and may contain space or comma or apostrophe or a dash.
3. In the output, I am looking for all the records from the database where the first four charters in the database table field match
with the input dataset after removing space or comma or apostrophe or a dash from both the sources.

OraTable
|
|
|
V
Xmer---------->OutputSeqFile
^
|
|
|
HashFile
^
|
|
|
SeqInpFile

Any help is appreciated.

Here is an example:
Table data: "MC Intosh"

Input file data:
"MC I"
"MCI"
"MCIn"

When a lookup is made into the input file for the database table record in all the three cases a match should be found.

I am looking for an approach where I can have tablefield name like inputfilefieldname||'%'

Any other suggestions is appreciated.

Thanks.
satya99
Participant
Posts: 104
Joined: Thu Nov 30, 2006 1:22 pm

Post by satya99 »

1. Trim all space's in source data
2. Trim all space's in look up
3. Create a stage variable

sv : if in.column[1,3] = lkp.column[1,3] then @true else @false

Constraint: sv

Good luck
satya
csri
Participant
Posts: 99
Joined: Wed Jul 11, 2007 8:02 am

Post by csri »

satya99 wrote:1. Trim all space's in source data
2. Trim all space's in look up
3. Create a stage variable

sv : if in.column[1,3] = lkp.column[1,3] then @true else @false

Constraint: sv

Good luck
Thanks for your reply but I don't think it works. Because if I have the following data then it will fail.

Database value as "PATTI" and lookup value as "PATI" if I trim and take only three characters to compare ("PAT" = "PAT") then they match even though they are not the same. This type of data will fail.

Thanks.
mystuff
Premium Member
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

I am looking for all the records from the database where the first four charters
Change the 1,3 in the code to 1,4 then?????????

If you are looking for dynamic join and based on your requirement.
3. In the output, I am looking for all the records from the database where the first four charters in the database table field match
with the input dataset after removing space or comma or apostrophe or a dash from both the sources.
Make use of length and substring functions while making the join
csri
Participant
Posts: 99
Joined: Wed Jul 11, 2007 8:02 am

Post by csri »

mystuff wrote:
I am looking for all the records from the database where the first four charters
Change the 1,3 in the code to 1,4 then?????????

If you are looking for dynamic join and based on your requirement.
3. In the output, I am looking for all the records from the database where the first four charters in the database table field match
with the input dataset after removing space or comma or apostrophe or a dash from both the sources.
Make use of length and substring functions while making the join
It does not work if I change from 1,3 to 1,4.
Example:
TableData: O'DEL
FileData: O'DE

But if I comapre (ODEL = ODE) does not work eventhough it is a match.

Yes, I am looking for a dynamic match but don't know how to implement.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Check the length and compare the first X characters where X is the length of the smaller of the two.
-craig

"You can never have too many knives" -- Logan Nine Fingers
csri
Participant
Posts: 99
Joined: Wed Jul 11, 2007 8:02 am

Post by csri »

chulett wrote:Check the length and compare the first X characters where X is the length of the smaller of the two. ...
Craig,

I am using a Universe stage with VOC pointer for lookup. Now, how can I make a lookup based on the length dynamically?

Thanks.
csri
Participant
Posts: 99
Joined: Wed Jul 11, 2007 8:02 am

Post by csri »

chulett wrote:Check the length and compare the first X characters where X is the length of the smaller of the two. ...
Craig,

I am using a Universe stage with VOC pointer for lookup. Now, how can I make a lookup based on the length dynamically?

Thanks.
csri
Participant
Posts: 99
Joined: Wed Jul 11, 2007 8:02 am

Post by csri »

csri wrote:
chulett wrote:Check the length and compare the first X characters where X is the length of the smaller of the two. ...
Craig,

I am using a Universe stage with VOC pointer for lookup. Now, how can I make a lookup based on the length dynamically?

Thanks.
To achieve the same results, I have performed lookup multiple times, one with three chars and another with four chars. This is not the best way to do but to get the desired results, I had to do this. But, it was taking long time to complete running. So, I have redesigned the job using the Oracle tables and indexes. I have created function based indexes on oracle tables and written a query to get the results. This works much faster.

Hope this helps.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

All that really matters is if it helps you. :wink:

Thanks for the closure, though, future searchers thank you. From the future. Later.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply