Performance on large look-up table

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Chandrathdsx
Participant
Posts: 59
Joined: Sat Jul 05, 2008 11:32 am

Performance on large look-up table

Post by Chandrathdsx »

I have a source data volume of one million on Oracle Table and I have a dimension table to be looked up with a volume of 80 million records to get the DWID from the look-up table. I am using the join to do this on a 8 node configuration. It is taking approximately 15mins to process the data and load in to target table. Gurus, could you suggest me am I doing the right thing? Is there any other better way to do it to improve the performance further?

Thank you.
Vikas Jain
Participant
Posts: 15
Joined: Tue Dec 13, 2005 12:38 am

Re: Performance on large look-up table

Post by Vikas Jain »

Hi,

I would say you are doing right by using a Join stage instead of lookup since your reference dimension table also is quite huge.
Additionally, you can sort the data prior to join stage on the key fields.
Also, while the job is executing, look for the server resource utilization as in scratch space available, # of processes etc. and if you find anything alarming, then either change the Environment variables in DSAdmin to appropriate values or else server configurations like maxuprocess etc..

~Vikas~
Post Reply