Page 1 of 1

Lookup Range

Posted: Wed Feb 23, 2005 4:12 am
by shiv_nm
Hi,
Can anybody help me on this pleas:

I have a lookup file like this:

From Till TYPE
2000000 2009999 SAP
3300000 3309999 Transport
3600000 3699999 Transport
4600000 4699999 Transport

I will have a Source File like this:

Vendor No Type
2000001
2000050
3600002

Now in transformer i need to look up for the file and compare Vendor no in between which range it falls, and then assign the TYPE collumn with TYPE column of Source into the target.

Waiting for the help:)
Regards
Shiv

Posted: Wed Feb 23, 2005 4:16 am
by roy
Hi,
server post in parallel forum? I hope it's not one you already posted in the server forum!

use search it was covered comprehensivle before (even I had a say regarding this in 1 occasion at least)

Good Luck,

Posted: Wed Feb 23, 2005 4:21 am
by ArndW
Hello Shiv,

have you tried to use the "search" functionality using "lookup range"? There are a number of threads that directly and indirectly apply to the kind of lookup that you are trying to do.

As is usual in DS, there are numerous ways of solving this type of lookup; none of which will work efficiently for all users.

If your dataset for the "Type" lookups isn't huge, then I would most likely write a DS-Job that takes this file and creates a hash file with one record for each entry in the range, with the key being the actual "Vendor" and the value being "Type". Then I would use this file as a hash memory file lookup in the main job. This is just one way to approach the problem, it creates a file overhead but is quite efficient for large volumes on Vendor source file. I am sure that others will post other solutions...

Posted: Wed Feb 23, 2005 4:27 am
by shiv_nm
HI,
Thanks for your kind Reply.
my problem is when i have a look file with these two columns and source file with one Vendor no, i have to do comparison that in between which range it falls, if it falls in between 2000000 2009999
then i have to assign the TYPE field as SAP.

Now my question is when u double click on transformer you will get two windows right with target details and left with source and look up details.
Left side i need to map source key field to lookup file key field right?
If at all i map to two fields there can i put any condition there? to compare in between two fields?

Do help me
Shiv



ArndW wrote:Hello Shiv,

have you tried to use the "search" functionality using "lookup range"? There are a number of threads that directly and indirectly apply to the kind of lookup that you are trying to do.

As is usual in DS, there are numerous ways of solving this type of lookup; none of which will work efficiently for all users.

If your dataset for the "Type" lookups isn't huge, then I would most likely write a DS-Job that takes this file and creates a hash file with one record for each entry in the range, with the key being the actual "Vendor" and the value being "Type". Then I would use this file as a hash memory file lookup in the main job. This is just one way to approach the problem, it creates a file overhead but is quite efficient for large volumes on Vendor source file. I am sure that others will post other solutions...

Posted: Wed Feb 23, 2005 4:43 am
by ArndW
Hello Shiv,

perhaps I wasn't quite clear enough in the first post. The LOOKUP part of a transform will not let you do a range. That is why I suggested you create an interim file which contains each and every possible value from your lookup file contents, and then the source file will not need to do a range lookup, but can lookup on the key directly.

Posted: Wed Feb 23, 2005 4:49 am
by roy
Hi,
try
viewtopic.php?t=84998
also in short you have several options:
1. if the number of ranges is small it might be done in a simple rouitne using a case/switch.
2. if the number of rows in your source is small you might use a direct DB lookup (every row is another sql sent to the DB!!!)
3. load a unique list of source values to temp table then unload a join from htat table and the ranges table using SQL logic (between and such) to unload a hash file/ lookup file with the value in the source data as key column and the actual values deriven from the range table as data.
once you have this you can simply perform a direct lookup (each unload file is specificly built to match the source data for that run)
4. use the universe stage to run sql on the hash file (slow performance)

all of the above was covered in hte past but after 10 minutes of searching I found only 2 relevant posts in the mass that came out I felt a more then use search was in order ;)

IHTH,

Re: Lookup Range

Posted: Wed Feb 23, 2005 4:52 am
by DBI
Hi Shiv,

if it's possible for you to make these entries in a database, then you could use the ODBC Stage for the reference Link.

In the ODBC Stage you are able to define a reference link SQL statement. So you make it in the way " where field FROM >= '' and field TILL <= '' "

So you'll get the Information from the range.

Dirk
shiv_nm wrote:Hi,
Can anybody help me on this pleas:

I have a lookup file like this:

From Till TYPE
2000000 2009999 SAP
3300000 3309999 Transport
3600000 3699999 Transport
4600000 4699999 Transport

I will have a Source File like this:

Vendor No Type
2000001
2000050
3600002

Now in transformer i need to look up for the file and compare Vendor no in between which range it falls, and then assign the TYPE collumn with TYPE column of Source into the target.

Waiting for the help:)
Regards
Shiv

Re: Lookup Range

Posted: Wed Feb 23, 2005 9:24 am
by varshanswamy
Shiv I had a similar problem and I have dealt it as follows, if your job is a server job, here is the solution.
The contents to be looked up (reference data) put it into a hash file, make from and till as the keys. In another job access this hash file using a universe stage as the universe stage allows you to query on the hash file just like a database stage to do the range lookup. If you still have doubts I shall clarify it, send your doubts across as my probelm is very similar to yours which I handled.
If your job is a parallel one, then the solution is using a column generator genrate a dummy column in both source and reference and then join them using the join stage, later you could use a filter stage to do the range checking that VENDOR NUMBER > FROM and VENDORNUMBER > TO.

If your inputs was from database i would suggest you to use sparse lookup and write the query in the database stage.

In my case have performance issue too, which I am working on till date
Here is my probelm which is similar to your problem

I have 2 database stages
First database stage contains a number field called IP_NUMBER.

Second database stage contains fields called

IP_FROM
IP_TO
COUNTRY_NAME
COUNTRY_CODE
DOMAIN_NAME

I have done a sparse lookup and a query in the database as follows
that is to obtain COUNTRY_CDOE, COUNTRY_NAME, DOMAIN_NAME based on the IP_NUMBER.

SELECT
COUNTRY_CODE,
COUNTRY_NAME,
DOMAIN_NAME
FROM
IP_LOOKUP
WHERE
ORCHESTRATE.IP_NUMBER
BETWEEN
IP_FROM
AND
IP_TO

The problem is since it is a sparse lookup, it is taking a very long time. Even though the table has been indexed, it is executing at the rate of one record per second, there are 1 million records, to be processed, so it is really very slow. Please suggest on this.

Posted: Tue Mar 01, 2005 2:46 pm
by T42
Ranged Lookup is not yet supported in EE. It is a feature coming very soon now (or probably already in 7.5.1, need to double check that.)

You will need to use a buildop if you're using a EE solution. Routines are also a good alternative if you're using Server.

Posted: Wed Mar 02, 2005 3:55 am
by sudharsanan
Hi,
I am facing a similar kind of issues..as you have said in the post
"The contents to be looked up (reference data) put it into a hash file, make from and till as the keys. In another job access this hash file using a universe stage as the universe stage allows you to query on the hash file just like a database stage to do the range lookup. If you still have doubts I shall clarify it, send your doubts across as my probelm is very similar to yours which I handled. "

when i tried to access the UV Database for my hashfile, i getting a table not found exception. Can someone tell me how to get the Universe Database table name (Equivalent to my hashfile)?.. i am new to Datastage,i haven't used UV stages till now?.. so guide me in resolving this problem...

Thanks

Posted: Wed Mar 02, 2005 6:46 am
by richdhan
Hi Sudarshan,

I think you have created the hash file using directory option and that is the reason you are getting table not found exception.

You can use the account option for creating hash file in which case you will be able to access the same hash file using UV stage. If you still want to persist with directory option then search for setfile command. This makes an entry of the hash file in VOC.

Once the hash file entry is available in VOC you can access the same using UV stage.

HTH
Rich

Posted: Wed Mar 02, 2005 4:02 pm
by ray.wurlod
Search the forum for SETFILE. This is a command for creating a pointer in the VOC file. Chuck will chime in soon with a plug for his utility CreateVocPtr.

Posted: Thu Mar 03, 2005 8:05 am
by chucksmith
My job here is done. Beam me up Scotty. :D