Lookup where ID is within the range of 2 columns

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Nicole
Premium Member
Premium Member
Posts: 93
Joined: Mon Oct 22, 2007 12:55 pm
Location: Albany
Contact:

Lookup where ID is within the range of 2 columns

Post by Nicole »

I have a file which has an id field and then I have a database table which has begin_id and end_id fields. I want to create a job which will get each ID from the file and then go to the table and find the row where the id from the file is in between the begin_id and end_id and then get a value from another column in that table and update the file with this value -- seems a little tricky.

Any suggestions on how I can do this type of a lookup?

Thanks!!!

Nicole
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That's a "range lookup" and if you do an exact search on that phrase you should find all kinds of conversations on the topic.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Nicole
Premium Member
Premium Member
Posts: 93
Joined: Mon Oct 22, 2007 12:55 pm
Location: Albany
Contact:

Post by Nicole »

Thank you...as I was waiting for a reply, I was playing around with the lookup stage and noticed the range option, but I wasn't sure if that was what I wanted, but now I see it is!!!

Thanks so much!
chulett wrote:That's a "range lookup" and if you do an exact search on that phrase you should find all kinds of conversations on the topic. ...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

There are two different kinds of range lookup - one where the range is on the stream input, the other where the range is on the reference input. Be careful how you specify it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Nicole
Premium Member
Premium Member
Posts: 93
Joined: Mon Oct 22, 2007 12:55 pm
Location: Albany
Contact:

Re: Lookup where ID is within the range of 2 columns

Post by Nicole »

I need to elaborate on this a little.

From the file I explained I have an id field and then I have a database table which has begin_id and end_id fields. I used the range lookup to find this. There is also a sequence number column from the table. In order to find the right row to lookup, I need to subtract the Beg_id from the id which will give me the sequence number, and then I will have the correct row. Can I do this extra step anywhere in the lookup?

WHERE (FILE.ID >= TABLE.BEG_ID AND FILE.ID <= TABLE.END_ID) AND FILE.ID - TABLE.BEG_ID = TABLE.SEQ_NMBR

THANKS!!!
Nicole wrote:I have a file which has an id field and then I have a database table which has begin_id and end_id fields. I want to create a job which will get each ID from the file and then go to the table and find the row where the id from the file is in between the begin_id and end_id and then get a value from another column in that table and update the file with this value -- seems a little tricky.

Any suggestions on how I can do this type of a lookup?

Thanks!!!

Nicole
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

No, it has to be done before the lookup. That is, in an upstream Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Nicole
Premium Member
Premium Member
Posts: 93
Joined: Mon Oct 22, 2007 12:55 pm
Location: Albany
Contact:

Post by Nicole »

Hmm.....I'm just not sure how I can do that, since I need data from the lookup of the table and data from the file together.

What exactly is an 'upstream transformer' stage? I tried to find posts which would be helpful, but none really described it. If I had to guess, it would be a transformer stage before the lookup, but like I said, I need data from both the input file and the table, in order to do the calculation.

Maybe this just can't be done?
ray.wurlod wrote:No, it has to be done before the lookup. That is, in an upstream Transformer stage. ...
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

To be really honest, instead of pulling your hair out and 'gotta, hafta do this in DataStage' approach, try the database approach. Load your file into a temp table in the same database instance and read your data by passing the query that contains the where predicate in your explanation.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
just4geeks
Premium Member
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

One way to accomplish this through DataStage is perform lookups using 2 lookups stages:

Use first lookup stage to perform range lookup:
WHERE (FILE.ID >= TABLE.BEG_ID AND FILE.ID <= TABLE.END_ID)

After this lookup, carry TABLE.BEG_ID and FILE.ID to next transformer stage ( along with all other columsn you want). In transformer, perform this calculation: FILE.ID - TABLE.BEG_ID = TmpSeqNmbr

Now in second lookup stage, again perform lookup with Table:
Where TmpSeqNmbr=TABLE.SEQ_NMBR

So you design would be like:

Lookupstage----Transformer-----LookupStage

Hope this helps!
Take care!
Attitude is everything....
Nicole
Premium Member
Premium Member
Posts: 93
Joined: Mon Oct 22, 2007 12:55 pm
Location: Albany
Contact:

Post by Nicole »

I think this might just work. Thank you so much for helping me think that one through! :)

I was considering other options, but I really didn't want to start something else since I had put the time into doing it this way.

Thanks,

Nicole

just4geeks wrote:One way to accomplish this through DataStage is perform lookups using 2 lookups stages:

Use first lookup stage to perform range lookup:
WHERE (FILE.ID >= TABLE.BEG_ID AND FILE.ID <= TABLE.END_ID)

After this lookup, carry TABLE.BEG_ID and FILE.ID to next transformer stage ( along with all other columsn you want). In transformer, perform this calculation: FILE.ID - TABLE.BEG_ID = TmpSeqNmbr

Now in second lookup stage, again perform lookup with Table:
Where TmpSeqNmbr=TABLE.SEQ_NMBR

So you design would be like:

Lookupstage----Transformer-----LookupStage

Hope this helps!
Take care!
Nicole
Premium Member
Premium Member
Posts: 93
Joined: Mon Oct 22, 2007 12:55 pm
Location: Albany
Contact:

Post by Nicole »

I have this job working just as I want it now, using 2 lookups, with the first one being a range lookup, except I am having a problem matching.

From the file, I have an id and from the database table, I lookup the range, but several of the ids will be within the same range, so it should match with the same row, but it seems to only match for the first one and then the rest are rejected. Am I doing something wrong?

I did see that there might be an issue with ranges on varchar, but I don't think this is it, it does seem to match, but again, it throws out the rest that should match for the same row. Any suggestions?
just4geeks
Premium Member
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

I don't have DataStage parallel in front of me but I remember there is an option in lookup stage where you can define if you want to match one or all. Go to Lookup Stage properties and select ALL option.

Thanks
Attitude is everything....
just4geeks
Premium Member
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

Here is what document says:
" If you want to impose conditions on your lookup, or want to use a reject link, you need to double-click on the Condition header of a reference link, choose Conditions from the link shortcut menu, or click the Condition toolbar button. The Lookup Stage Conditions dialog box appears. This allows you to:
1. Specify that one of the reference links is allowed to return multiple rows when performing a lookup without causing an error (choose the relevant reference link from the Multiple rows returned from link list).
2. Specify a condition for the required references. Double-click the Condition box (or press CTRL-E) to open the expression editor. This expression can access all the columns of the primary link, plus columns in reference links that are processed before this link.
3. Specify what happens if the condition is not met on each link.
4. Specify what happens if a lookup fails on each link."

This is what you need to specify in order to match all:
"Specify that one of the reference links is allowed to return multiple rows when performing a lookup without causing an error (choose the relevant reference link from the Multiple rows returned from link list). "
Attitude is everything....
Nicole
Premium Member
Premium Member
Posts: 93
Joined: Mon Oct 22, 2007 12:55 pm
Location: Albany
Contact:

Post by Nicole »

Ok, I got it. And the job worked. Thanks, you saved me. People have been bothering me for this job all week.

Sorry for all of the questions on this one.
just4geeks wrote:I don't have DataStage parallel in front of me but I remember there is an option in lookup stage where you can define if you want to match one or all. Go to Lookup Stage properties and select ALL option.

Thanks
Post Reply