How to retrieve multiple rows using lookup table

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
tkozlows
Premium Member
Premium Member
Posts: 13
Joined: Tue Dec 03, 2002 7:31 pm
Location: Melbourne, Australia

How to retrieve multiple rows using lookup table

Post by tkozlows »

I need to retrieve multiple rows from an Oracle master database using records stored in another (lookup) Oracle database. Essentially, each record in the lookup table will have multiple corresponding rows in the master table. All rows retrieved from the master table need to be then stored in another Oracle table.

I would greatly appreciate if anyone could point me in the right direction.

Thank you kindly
Thomas
manishsk
Participant
Posts: 13
Joined: Mon Mar 14, 2005 9:37 pm

Re: How to retrieve multiple rows using lookup table

Post by manishsk »

You can use oracle plugin as a lookup directly. Another way is I think you need to create a database link(from the master database) to the other database and use oracle plugin to access the table from master database. Datbase link is needed if the lookup table in in another database.

Hope that helps.

Regards,
Manish
tkozlows wrote:I need to retrieve multiple rows from an Oracle master database using records stored in another (lookup) Oracle database. Essentially, each record in the lookup table will have multiple corresponding rows in the master table. All rows retrieved from the master table need to be then stored in another Oracle table.

I would greatly appreciate if anyone could point me in the right direction.

Thank you kindly
Thomas
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The ODBC stage type is equipped for multi-row returns from lookups.
Invoke help, and search for Multi.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
tkozlows
Premium Member
Premium Member
Posts: 13
Joined: Tue Dec 03, 2002 7:31 pm
Location: Melbourne, Australia

Post by tkozlows »

Thank you, but I do not have an ODBC connectivity between DS and the Oracle servers. :cry:
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

But you could if you really needed it. All you have to do is set it up, the drivers come with the product.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Maybe we can help point you in a direction that better utilizies the features in DataStage. Can we get an idea of row counts in the source and reference tables? There are alternatives, such as high-performance hash files that provide reference capabilities, and maybe just approaching your design dilemma with a different perspective can change your solution.

Can you give specific table names and tell us what you need to achieve? Multi-row lookups are usually found in either slowly-changing dimension type lookups, such as ranged queries using a BETWEEN statement, or in situations where you can't explicitly identify the foreign key relationship. Otherwise, folks tend to think of the "master" as the driving table, instead of the "detail" as the driving table. Simply turning that around achieves what people want, plus enables all of the high-performance capabilities in DataStage.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
tkozlows
Premium Member
Premium Member
Posts: 13
Joined: Tue Dec 03, 2002 7:31 pm
Location: Melbourne, Australia

Post by tkozlows »

The master table holds approx 70M item records. My lookup table has about 1M product records. Using the product records in the lookup table, I expect to extract approx 3-4M item records from the master table. I hope this helps.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The item master table has a product foreign key? Are you saying that you wish to select from the item master based on the product keys in the lookup table? Would pseudo-sql look something like:

Code: Select all

select * from item_master where product key in (select product_key from product_lookup)
If this is the case, your job design choice of ODBC or OCI lookups based on your row counts will be extremely slow. Are the two tables on separate instances, or co-located?
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
tkozlows
Premium Member
Premium Member
Posts: 13
Joined: Tue Dec 03, 2002 7:31 pm
Location: Melbourne, Australia

Post by tkozlows »

That's right. The product_key in product_lookup is a subset of the product_key in item_master, hence there are multiple records in item_master for each record in product_lookup.

The two tables 'live' in separate database instance (and separate physical hosts).
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The problem with using an OCI/ODBC lookup is that it will take forever, basically all 70 million rows in the item master table will generate a SQL query against the product table. You do the math, that's 70 million queries. All 70 million rows have to exit the item master instance, traverse the network, enter a DS transformer, generate the query, wait for the response, and then check the result. Now do all of that 70 million times, only allowing the 3-4 million rows you're interested in pass thru.

Is there any chance you can create a work table within the item master instance, say in a user work shema, and pull the product keys you're interested in from the product lookup table and put there. Then, doing a constraining inner-join, only pull the rows you want? This is BY FAR the best solution.

If you cannot get a work table, your choices become very limited. You will be required to pull all 70 million rows from the item master across to the DataStage server, and throw away the ones you don't need. Instead of an OCI/ODBC lookup, simply build a hash lookup just containing the product keys you want, and constrain against that reference lookup. You'll find that a lot faster because you only hit the network and db once to pull the product keys from the lookup table into the hash file. You can then use hash file caching to accelerate performance.

You'll want to instantiate the job extracting the data from the item master table, using some sort of data partitioning scheme on the WHERE clause to create multiple connections to the database. Use multiple job instances to handle this, so that you get the 70 million rows out as fast as possible.

Good luck.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
tkozlows
Premium Member
Premium Member
Posts: 13
Joined: Tue Dec 03, 2002 7:31 pm
Location: Melbourne, Australia

Post by tkozlows »

Great. Thank you very much for your help. :D
Post Reply