Sparce lookup

deployDS · Post by **deployDS** » Thu Aug 17, 2006 12:49 pm

Hi,

I need to do a look up for a dataset that could have a maximum of 50 rows with a DB2 table that could a minimum of 30 million rows.
For each row in dataset, there could be more than one matches from database. I am presently able to accomplish this using Join stage. But it is taking a very long time to pull data from database and do the lookup. I'm using a DB2 API stage as I do not have an enterprise stage.

A collegue suggested that I use a sparce look up through ODBC enterprise stage. But I was wondering if there's a way to get more than one matches from database when using lookup stage. Can anyone please suggest me?

Thanks!

koolnitz · Post by **koolnitz** » Thu Aug 17, 2006 1:32 pm

1. You cannot use a Sparse lookup with DB2 API stage. See this post. I am not sure about ODBC.

2. A lookup can fetch multiple records from lookup table, if you check "Multiple rows returned from link" option in Constraints properties of Lookup stage.

deployDS · Post by **deployDS** » Thu Aug 17, 2006 1:47 pm

Thanks Nithin! I am not trying to use API for sparce lookup. I know I cannot. I was using Join stage with API to fetch records.

Since we don't have enterprise db2, I'm trying to do sparce lookup through ODBC Enterprise stage. And when I use ODBC enterprise and set it for sparce lookup, the "Multiple rows returned from link" becomes disabled.

Any other suggestions?

Krazykoolrohit · Post by **Krazykoolrohit** » Thu Aug 17, 2006 1:47 pm

changing join with lookup will not reduce time considerably.

I suggest that you break up the job into two. in one job just extract the table and perform join and other transformations in the second job.

Alternatively, extract from the table and write to a sequential file stage in the same job (introduce a sequential file between your API stage and the join stage.

deployDS · Post by **deployDS** » Thu Aug 17, 2006 1:59 pm

Thank you Rohit, but isn't 30million-50million too much of data to be extracted to a file, while all I need to do is find matches for about 60 rows?

"changing join with lookup will not reduce time considerably."
I've never worked with sparce lookup before, but won't sparce lookup reduce the time considerably in my case?

Thanks!

DSguru2B · Post by **DSguru2B** » Thu Aug 17, 2006 2:06 pm

Load the 60 records into a temp table and utilize the power of UDB by doing a join at the database level. That would be faster anyday, keeping in mind the network traffic that will be induced if 30M records are to be loaded to a file and then doing a lookup.

deployDS · Post by **deployDS** » Thu Aug 17, 2006 2:26 pm

Thanks DSguru2B, that's a useful suggestion, but I cannot implement it because my target tables reside on a different database from staging tables. I only have access to create tables at staging level, while my lookup table is a target table.

Is there nothing I can do within sparce lookup in ODBC enterprise stage?

DSguru2B · Post by **DSguru2B** » Thu Aug 17, 2006 2:45 pm

Why dont you build a lookup set from the 50 records. Have the DB2 table as your input and do a lookup. This way you will get all the records from the table that have a hit on the dataset. I am sure you must have thought of that but whats stopping you from using this method.
I am not very familiar with the ODBC enterprise stage so cannot help you in that.
I know in sql server there is a function which can be used to connect to a different database from one database. Dont know if any such thing exists for Db2. Do some research, i am sure it exists.

chulett · Post by **chulett** » Thu Aug 17, 2006 2:46 pm

deployDS wrote:Thanks DSguru2B, that's a useful suggestion, but I cannot implement it because my target tables reside on a different database from staging tables. I only have access to create tables at staging level, while my lookup table is a target table.

That really shouldn't be an issue. Can you not pull a list of the natural keys from your current staging tables and load them into a work table in the same db as the target tables? Then you can join them properly.

deployDS · Post by **deployDS** » Thu Aug 17, 2006 3:50 pm

My problem is solved. In fact, when using sparce lookup through ODBC enterprise stage, it outputs all the matches my default. I didn't know this and couldn't test till my driver was installed just now. And it is amazingly fast

Chulett, the problem is that I am not allowed to create any work tables in the db where target tables exist.

DSguru2B, yes, I did think of that, but was looking for ways to avoid calling all data from the table.

Thank you all for your responses n suggestions!