Any other option to minimise look ups

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
raj_cipher
Participant
Posts: 90
Joined: Mon Dec 08, 2003 4:48 am
Location: Chennai

Any other option to minimise look ups

Post by raj_cipher »

Hi ,

I got the info that using more no. of look ups affects perfomance(Speed).
In my case,i am using odbc stage for look ups.The records that i look up is in millions.The no. of look ups is also around 15.Is there any way to minimise look ups ?
Think Ahead,
Raj.D
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Pre-load the lookup tables to memory. You can't do this with ODBC; you will need to use hashed files.
Pre-allocate disk for the hashed files (use MINIMUM.MODULUS option, to get a value use the Hashed File Calculator on your DataStage CD).
Loading the hashed files won't take long (minutes). Plus you need to design some simple jobs (15, was it?) to accomplish same.
Memory access speed is AT LEAST 1000 TIMES faster than disk access, which is what you need for ODBC.
It's definitely worth doing.
All aspects of this have been covered in the past on this forum, using its search facility will help you to fill in the gaps of my brief overview.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jseclen
Participant
Posts: 133
Joined: Wed Mar 05, 2003 4:19 pm
Location: Lima - Peru. Sudamerica
Contact:

Re: Any other option to minimise look ups

Post by jseclen »

Hi,

Another way is create a query that contain almost lookups that you need.

>For example:

select cod_cli, cod_cta, cod_tip from client_table

select cod_tip, des_tip from cat_tip

in your query join both tables:
select a.cod_cli, a.cod_cta, a.cod_tip, b.des_tip
from client_table a, cat_tip b
where a.cod_tip = b.cod_tip

with this query you don't need lookups in DataStage. :lol:
Saludos,

Miguel Seclén
Lima - Peru
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: Any other option to minimise look ups

Post by ogmios »

My 2 cents would be to try to get your lookups included in the input query (calculating the data in the database not via DataStage). I only use database lookups or hashfiles for very large dimensions that are in other databases as the input-query.

In 75% of the cases a properly tuned database will outrun DataStage anytime for these kind of lookups... try it with a testcase.

My experience is also that although hash-files are of course faster than ODBC lookups having a lot of big hash-files on your system tends to trash the memory slowing down the complete system (Solaris/Datastage 6.0.x/2Gb Memory).

Use the database... one if its main purposes is doing these kind of joins (left outer join the tables if you want to mimic lookups).
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

If you are building a data warehouse in batch mode:

Typically, you have surrogate keys. Therefore, you have a common task of switching natural keys for their warehouse assigned surrogate. This requires source data to reference target tables for surrogate foreign key substitution.

Since this is done over and over and over and over in soooo many jobs, doing the same subsitution for the same dimension soooo many times during a single load cycle (production dimensions, time dimensions, customer dimensions, code dimensions, etc) are in a lot of tables. This repititous lookup is burdensome on a database.

This is why you can prepare hash lookups, using all of the functionality meant to eliminate the burden of repititous and shared lookup. I populate terabyte data warehouses using just plain old Server. We know what we're talking about. It's a balance of functionality and performance with maintenance.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply