which is better - join query or odbc lookup ?

raj_cipher · Post by **raj_cipher** » Mon Feb 09, 2004 6:36 am

Hi,

I hv to select records from various tables.
I can use SQL Join query or use look-up to acheive this.
The volume of the records is in millions.
The count of the tables i'm going to use will be around seven.

Which of the following is a better option ?

Using a join query or using ODBC look-ups.

chulett · Post by **chulett** » Mon Feb 09, 2004 7:35 am

I don't think any DataStage question can be answered with "it's better to use ODBC lookups".

Your question should probably be comparing Joins with HASH lookups.

My general Rule of Thumb is to say - if it can be done in the database, do it in the database. I'm curious what others think, but if I can make Oracle do the work... I let it! Especially something like avoiding a lookup when a simple join in the sourcing query can take its place.

essaion · Post by **essaion** » Mon Feb 09, 2004 9:03 am

I totally agree with Craig (for what is worth :)

nsm · Post by **nsm** » Mon Feb 09, 2004 9:08 am

Its better to go for Join Query..

nsm.

ray.wurlod · Post by **ray.wurlod** » Mon Feb 09, 2004 7:33 pm

I think it's worth clarifying the reasons for the unanimity.

First, there's nothing technically wrong with using the ODBC stage to perform reference lookups. It will work. But it will work very slowly, leading probably to unacceptable throughput.

Pre-loading hashed files and performing the reference lookups therein is a much more performant solution than ODBC; it doesn't take long to load them with a simple SELECT from the source database.

However, you then have to process every row from the driver table.

The advantage of performing the join at source is that you already have exactly the rows and columns that you need and - here's the crux of the argument - no redundant rows for DataStage to process.

To summarise.

When the tables to be joined are in the same database instance, perform the join at source.
When the tables are in different database instances, pre-load hashed files and perform reference lookups to them.