Lookup Range

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
shiv_nm
Participant
Posts: 13
Joined: Thu Sep 09, 2004 11:23 pm
Location: Bangalore
Contact:

Lookup Range

Post by shiv_nm »

Hi,
Can anybody help me on this pleas:

I have a lookup file like this:

From Till TYPE
2000000 2009999 SAP
3300000 3309999 Transport
3600000 3699999 Transport
4600000 4699999 Transport

I will have a Source File like this:

Vendor No Type
2000001
2000050
3600002

Now in transformer i need to look up for the file and compare Vendor no in between which range it falls, and then assign the TYPE collumn with TYPE column of Source into the target.

Waiting for the help:)
Regards
Shiv
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
server post in parallel forum? I hope it's not one you already posted in the server forum!

use search it was covered comprehensivle before (even I had a say regarding this in 1 occasion at least)

Good Luck,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Hello Shiv,

have you tried to use the "search" functionality using "lookup range"? There are a number of threads that directly and indirectly apply to the kind of lookup that you are trying to do.

As is usual in DS, there are numerous ways of solving this type of lookup; none of which will work efficiently for all users.

If your dataset for the "Type" lookups isn't huge, then I would most likely write a DS-Job that takes this file and creates a hash file with one record for each entry in the range, with the key being the actual "Vendor" and the value being "Type". Then I would use this file as a hash memory file lookup in the main job. This is just one way to approach the problem, it creates a file overhead but is quite efficient for large volumes on Vendor source file. I am sure that others will post other solutions...
shiv_nm
Participant
Posts: 13
Joined: Thu Sep 09, 2004 11:23 pm
Location: Bangalore
Contact:

Post by shiv_nm »

HI,
Thanks for your kind Reply.
my problem is when i have a look file with these two columns and source file with one Vendor no, i have to do comparison that in between which range it falls, if it falls in between 2000000 2009999
then i have to assign the TYPE field as SAP.

Now my question is when u double click on transformer you will get two windows right with target details and left with source and look up details.
Left side i need to map source key field to lookup file key field right?
If at all i map to two fields there can i put any condition there? to compare in between two fields?

Do help me
Shiv



ArndW wrote:Hello Shiv,

have you tried to use the "search" functionality using "lookup range"? There are a number of threads that directly and indirectly apply to the kind of lookup that you are trying to do.

As is usual in DS, there are numerous ways of solving this type of lookup; none of which will work efficiently for all users.

If your dataset for the "Type" lookups isn't huge, then I would most likely write a DS-Job that takes this file and creates a hash file with one record for each entry in the range, with the key being the actual "Vendor" and the value being "Type". Then I would use this file as a hash memory file lookup in the main job. This is just one way to approach the problem, it creates a file overhead but is quite efficient for large volumes on Vendor source file. I am sure that others will post other solutions...
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Hello Shiv,

perhaps I wasn't quite clear enough in the first post. The LOOKUP part of a transform will not let you do a range. That is why I suggested you create an interim file which contains each and every possible value from your lookup file contents, and then the source file will not need to do a range lookup, but can lookup on the key directly.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
try
viewtopic.php?t=84998
also in short you have several options:
1. if the number of ranges is small it might be done in a simple rouitne using a case/switch.
2. if the number of rows in your source is small you might use a direct DB lookup (every row is another sql sent to the DB!!!)
3. load a unique list of source values to temp table then unload a join from htat table and the ranges table using SQL logic (between and such) to unload a hash file/ lookup file with the value in the source data as key column and the actual values deriven from the range table as data.
once you have this you can simply perform a direct lookup (each unload file is specificly built to match the source data for that run)
4. use the universe stage to run sql on the hash file (slow performance)

all of the above was covered in hte past but after 10 minutes of searching I found only 2 relevant posts in the mass that came out I felt a more then use search was in order ;)

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
DBI
Participant
Posts: 3
Joined: Thu Jul 31, 2003 8:17 am

Re: Lookup Range

Post by DBI »

Hi Shiv,

if it's possible for you to make these entries in a database, then you could use the ODBC Stage for the reference Link.

In the ODBC Stage you are able to define a reference link SQL statement. So you make it in the way " where field FROM >= '' and field TILL <= '' "

So you'll get the Information from the range.

Dirk
shiv_nm wrote:Hi,
Can anybody help me on this pleas:

I have a lookup file like this:

From Till TYPE
2000000 2009999 SAP
3300000 3309999 Transport
3600000 3699999 Transport
4600000 4699999 Transport

I will have a Source File like this:

Vendor No Type
2000001
2000050
3600002

Now in transformer i need to look up for the file and compare Vendor no in between which range it falls, and then assign the TYPE collumn with TYPE column of Source into the target.

Waiting for the help:)
Regards
Shiv
varshanswamy
Participant
Posts: 48
Joined: Thu Mar 11, 2004 10:32 pm

Re: Lookup Range

Post by varshanswamy »

Shiv I had a similar problem and I have dealt it as follows, if your job is a server job, here is the solution.
The contents to be looked up (reference data) put it into a hash file, make from and till as the keys. In another job access this hash file using a universe stage as the universe stage allows you to query on the hash file just like a database stage to do the range lookup. If you still have doubts I shall clarify it, send your doubts across as my probelm is very similar to yours which I handled.
If your job is a parallel one, then the solution is using a column generator genrate a dummy column in both source and reference and then join them using the join stage, later you could use a filter stage to do the range checking that VENDOR NUMBER > FROM and VENDORNUMBER > TO.

If your inputs was from database i would suggest you to use sparse lookup and write the query in the database stage.

In my case have performance issue too, which I am working on till date
Here is my probelm which is similar to your problem

I have 2 database stages
First database stage contains a number field called IP_NUMBER.

Second database stage contains fields called

IP_FROM
IP_TO
COUNTRY_NAME
COUNTRY_CODE
DOMAIN_NAME

I have done a sparse lookup and a query in the database as follows
that is to obtain COUNTRY_CDOE, COUNTRY_NAME, DOMAIN_NAME based on the IP_NUMBER.

SELECT
COUNTRY_CODE,
COUNTRY_NAME,
DOMAIN_NAME
FROM
IP_LOOKUP
WHERE
ORCHESTRATE.IP_NUMBER
BETWEEN
IP_FROM
AND
IP_TO

The problem is since it is a sparse lookup, it is taking a very long time. Even though the table has been indexed, it is executing at the rate of one record per second, there are 1 million records, to be processed, so it is really very slow. Please suggest on this.
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Post by T42 »

Ranged Lookup is not yet supported in EE. It is a feature coming very soon now (or probably already in 7.5.1, need to double check that.)

You will need to use a buildop if you're using a EE solution. Routines are also a good alternative if you're using Server.
sudharsanan
Participant
Posts: 64
Joined: Fri Jul 16, 2004 7:53 am

Post by sudharsanan »

Hi,
I am facing a similar kind of issues..as you have said in the post
"The contents to be looked up (reference data) put it into a hash file, make from and till as the keys. In another job access this hash file using a universe stage as the universe stage allows you to query on the hash file just like a database stage to do the range lookup. If you still have doubts I shall clarify it, send your doubts across as my probelm is very similar to yours which I handled. "

when i tried to access the UV Database for my hashfile, i getting a table not found exception. Can someone tell me how to get the Universe Database table name (Equivalent to my hashfile)?.. i am new to Datastage,i haven't used UV stages till now?.. so guide me in resolving this problem...

Thanks
richdhan
Premium Member
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Sudarshan,

I think you have created the hash file using directory option and that is the reason you are getting table not found exception.

You can use the account option for creating hash file in which case you will be able to access the same hash file using UV stage. If you still want to persist with directory option then search for setfile command. This makes an entry of the hash file in VOC.

Once the hash file entry is available in VOC you can access the same using UV stage.

HTH
Rich
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Search the forum for SETFILE. This is a command for creating a pointer in the VOC file. Chuck will chime in soon with a plug for his utility CreateVocPtr.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chucksmith
Premium Member
Premium Member
Posts: 385
Joined: Wed Jun 16, 2004 12:43 pm
Location: Virginia, USA
Contact:

Post by chucksmith »

My job here is done. Beam me up Scotty. :D
Post Reply