Performance improvement for Lookup

rony_daniel
Participant
Posts: 36
Joined: Thu Sep 01, 2005 5:44 am
Location: Canada

Post by rony_daniel »

Hi All,

I have a set of 10 jobs which are called by a sequence, one after the other. The last 3 jobs take almost 75% of the total time required to run the sequence, because they perform lookups against DB2 tables that are huge in volume. To improve the performance of the sequence, I am thinking of running the DB2 queries up front, storing the result sets in temporary files, and using those files in the last 3 jobs. For this I will create new jobs which just query the DB2 tables and create the lookup files. These jobs will start along with the other initial jobs in the sequence, so that by the time the last 3 jobs are ready to run, the lookup files have already been created.

My question is which file stage I should use for this: Data Set, File Set or Lookup File Set, so that performance is as good as possible. Right now I am using a Join stage for the lookups. Also, will this approach give a substantial improvement in performance?

Please share your experiences.
Thanks & Regards,
Rony
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

A Data Set, without a doubt. The other two require export/import or index-building operations. The operator that writes and reads a Data Set is copy, which tells you that it is very low cost: the virtual Data Set associated with the link is simply copied to or from the Data Set's data files on disk.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
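
The scheduling idea in the original question (start the lookup extracts alongside the initial jobs, and wait for them only before the last jobs) can be sketched outside DataStage as follows. This is a minimal Python illustration, not DataStage code: run_job, run_sequence and the job names are hypothetical, and time.sleep merely stands in for a real job run. In a real project the same shape would presumably be built with parallel branches in the job sequence.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(name, seconds, log):
    """Hypothetical stand-in for running one job: sleep, then record completion."""
    time.sleep(seconds)
    log.append(name)
    return name


def run_sequence(initial_jobs, extract_jobs, final_jobs):
    """Run extract jobs concurrently with the initial jobs, then the final jobs.

    Each argument is a list of (name, seconds) pairs. Returns the job names
    in the order in which they completed.
    """
    log = []
    with ThreadPoolExecutor(max_workers=max(1, len(extract_jobs))) as pool:
        # Start the lookup extracts in the background.
        futures = [pool.submit(run_job, n, s, log) for n, s in extract_jobs]
        # Meanwhile, run the initial jobs one after the other.
        for n, s in initial_jobs:
            run_job(n, s, log)
        # Do not start the final jobs until every extract has finished.
        for f in futures:
            f.result()
    # The final jobs can now read the pre-built lookup data.
    for n, s in final_jobs:
        run_job(n, s, log)
    return log
```

For example, run_sequence([("job1", 0.01), ("job2", 0.01)], [("extract_a", 0.005)], [("job8", 0.0)]) runs extract_a while job1 and job2 execute, and starts job8 only after all three have completed. The gain depends on how much of the extract time can be hidden behind the initial jobs.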