How to retrieve multiple rows using lookup table
Moderators: chulett, rschirm, roy
How to retrieve multiple rows using lookup table
I need to retrieve multiple rows from an Oracle master database using records stored in another (lookup) Oracle database. Essentially, each record in the lookup table will have multiple corresponding rows in the master table. All rows retrieved from the master table need to be then stored in another Oracle table.
I would greatly appreciate if anyone could point me in the right direction.
Thank you kindly
Thomas
I would greatly appreciate if anyone could point me in the right direction.
Thank you kindly
Thomas
Re: How to retrieve multiple rows using lookup table
You can use oracle plugin as a lookup directly. Another way is I think you need to create a database link(from the master database) to the other database and use oracle plugin to access the table from master database. Datbase link is needed if the lookup table in in another database.
Hope that helps.
Regards,
Manish
Hope that helps.
Regards,
Manish
tkozlows wrote:I need to retrieve multiple rows from an Oracle master database using records stored in another (lookup) Oracle database. Essentially, each record in the lookup table will have multiple corresponding rows in the master table. All rows retrieved from the master table need to be then stored in another Oracle table.
I would greatly appreciate if anyone could point me in the right direction.
Thank you kindly
Thomas
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Maybe we can help point you in a direction that better utilizies the features in DataStage. Can we get an idea of row counts in the source and reference tables? There are alternatives, such as high-performance hash files that provide reference capabilities, and maybe just approaching your design dilemma with a different perspective can change your solution.
Can you give specific table names and tell us what you need to achieve? Multi-row lookups are usually found in either slowly-changing dimension type lookups, such as ranged queries using a BETWEEN statement, or in situations where you can't explicitly identify the foreign key relationship. Otherwise, folks tend to think of the "master" as the driving table, instead of the "detail" as the driving table. Simply turning that around achieves what people want, plus enables all of the high-performance capabilities in DataStage.
Can you give specific table names and tell us what you need to achieve? Multi-row lookups are usually found in either slowly-changing dimension type lookups, such as ranged queries using a BETWEEN statement, or in situations where you can't explicitly identify the foreign key relationship. Otherwise, folks tend to think of the "master" as the driving table, instead of the "detail" as the driving table. Simply turning that around achieves what people want, plus enables all of the high-performance capabilities in DataStage.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
The item master table has a product foreign key? Are you saying that you wish to select from the item master based on the product keys in the lookup table? Would pseudo-sql look something like:
If this is the case, your job design choice of ODBC or OCI lookups based on your row counts will be extremely slow. Are the two tables on separate instances, or co-located?
Code: Select all
select * from item_master where product key in (select product_key from product_lookup)
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
The problem with using an OCI/ODBC lookup is that it will take forever, basically all 70 million rows in the item master table will generate a SQL query against the product table. You do the math, that's 70 million queries. All 70 million rows have to exit the item master instance, traverse the network, enter a DS transformer, generate the query, wait for the response, and then check the result. Now do all of that 70 million times, only allowing the 3-4 million rows you're interested in pass thru.
Is there any chance you can create a work table within the item master instance, say in a user work shema, and pull the product keys you're interested in from the product lookup table and put there. Then, doing a constraining inner-join, only pull the rows you want? This is BY FAR the best solution.
If you cannot get a work table, your choices become very limited. You will be required to pull all 70 million rows from the item master across to the DataStage server, and throw away the ones you don't need. Instead of an OCI/ODBC lookup, simply build a hash lookup just containing the product keys you want, and constrain against that reference lookup. You'll find that a lot faster because you only hit the network and db once to pull the product keys from the lookup table into the hash file. You can then use hash file caching to accelerate performance.
You'll want to instantiate the job extracting the data from the item master table, using some sort of data partitioning scheme on the WHERE clause to create multiple connections to the database. Use multiple job instances to handle this, so that you get the 70 million rows out as fast as possible.
Good luck.
Is there any chance you can create a work table within the item master instance, say in a user work shema, and pull the product keys you're interested in from the product lookup table and put there. Then, doing a constraining inner-join, only pull the rows you want? This is BY FAR the best solution.
If you cannot get a work table, your choices become very limited. You will be required to pull all 70 million rows from the item master across to the DataStage server, and throw away the ones you don't need. Instead of an OCI/ODBC lookup, simply build a hash lookup just containing the product keys you want, and constrain against that reference lookup. You'll find that a lot faster because you only hit the network and db once to pull the product keys from the lookup table into the hash file. You can then use hash file caching to accelerate performance.
You'll want to instantiate the job extracting the data from the item master table, using some sort of data partitioning scheme on the WHERE clause to create multiple connections to the database. Use multiple job instances to handle this, so that you get the 70 million rows out as fast as possible.
Good luck.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle