Performance improvement for Lookup

Post by **rony_daniel** » Fri May 30, 2008 10:55 am

Hi All,

I have a set of 10 jobs which are called by a sequence, one after the other. The last 3 jobs take almost 75% of the total time required to run the sequence. This is because these jobs do lookups against DB2 tables which are huge in volume. To improve the performance of the sequence, I am thinking of running the DB2 queries up front, storing the result sets in temporary files, and using those files in the last 3 jobs. For this I will create new jobs which just query the DB2 tables and create the lookup files. These jobs will start along with the other initial jobs in the sequence, so that by the time the last 3 jobs are ready to run, the lookup files have already been created.
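To make the ordering concrete, here is a rough sketch of the idea in Python rather than in an actual DataStage sequence. Everything in it is hypothetical: `run_job` is only a stand-in that records a job name instead of launching a real job, and the job names are made up. It just shows the extract jobs starting together with the initial jobs, and the final jobs waiting until every extract has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(name, log):
    # Hypothetical stand-in for launching a job; it only records the name.
    log.append(name)
    return name


def run_sequence(initial_jobs, extract_jobs, final_jobs):
    log = []
    with ThreadPoolExecutor() as pool:
        # Start the lookup extracts immediately so they overlap with the initial jobs.
        extracts = [pool.submit(run_job, job, log) for job in extract_jobs]
        # The initial jobs still run one after the other, as in the original sequence.
        for job in initial_jobs:
            run_job(job, log)
        # Do not continue until every lookup file has been written.
        for future in extracts:
            future.result()
    # Only now run the jobs that read the prepared lookup files.
    for job in final_jobs:
        run_job(job, log)
    return log
```

For example, `run_sequence(["job1", "job2"], ["extract_a"], ["job8"])` always records `job8` last, while `extract_a` may finish before, between, or after the two initial jobs.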

My question is which file stage I should use for this: Data Set, File Set, or Lookup File Set, so that performance is best? Right now I am using a Join stage to do the lookups. Also, will there be a substantial improvement in performance if I make this change?

Please share your experiences.

Post by **ray.wurlod** » Fri May 30, 2008 4:02 pm

Data Set, without a doubt. The other two require export/import or index-building operations. The operator used to write or read a Data Set is copy, which tells you that it's very low cost: the virtual Data Set associated with the link is simply copied to or from the Data Set's data files on disk.