Lookup Performance

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
sreedhards
Participant
Posts: 4
Joined: Tue Jan 15, 2013 7:21 am
Location: Bangalore

Lookup Performance

Post by sreedhards »

Hi,
I'm using Lookup stage in my job,there are 10 lookup tables and one primary table.Please suggest any tips for better performance on higher volumes.

Thanks,
Sreedhar S.
Cheers,
Forrest.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

The number of lookups you perform isn't as important as the type of data source behind the lookups (is it a dataset? A database table?) and, more importantly, how big the data volumes are in the lookups themselves. Without knowing the size of the reference data (in MB, average record length times number of records) no suggestions or recommendation can be made.
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Re: Lookup Performance

Post by rkashyap »

What is the issue that you are facing ... What is the volume of incoming records vs lookup table size? How long is the execution time?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The answer will depend on many factors, including what other load there is on the server(s).

For example you could use 10 Lookup stages and disable operator combination, so that the individual lookups are being performed in separate processes. If not much else is happening on the machine, this would tend to be quite quick, particularly if you have a large volume of data and relatively small reference data sets.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

If your lookups have small number of rows then it works fine. Your inbound data stream can be huge without issue. If your lookup table size grows then it becomes necessary to use other stage types no matter how large the inbound stream. You can use sparse lookups with small inbound stream and large lookup tables. Should have indexes on the keys in a sparse lookup.
Mamu Kim
Post Reply