Performance Tuning.

mprashant · Post by **mprashant** » Mon Jul 26, 2004 6:05 pm

Hello
I have a huge job that has performance issues. It takes 1 hr 25 min to move 2100 rows! The data itself is not large, but there are 2 directly connected transformers, each with over 20 lookups to Universe stages. The reason for the Universe stages is that hashed files do not allow me to look up values within a range.
Is there a way I can increase the performance of this job? The previous consultant used the Universe stage. Is there a benefit to using Universe stages as opposed to, say, a DRS stage?

chulett · Post by **chulett** » Mon Jul 26, 2004 6:12 pm

Check this post for a very interesting way to do what it sounds like you are doing with UV stages in a hash stage instead. It takes a little while to set up but it's very speedy stuff.

Or you may just need to build indexes for your hashed key columns... are they indexed?

rasi · Post by **rasi** » Mon Jul 26, 2004 9:42 pm

Try splitting the one job that does 20 lookups into several jobs with a few lookups each. This should help.

Cheers
Rasi

mprashant · Post by **mprashant** » Tue Jul 27, 2004 7:55 pm

The lookups were against Universe databases; I changed them all to DRS stages and the job now runs in 5 min. I then split the lookups over a couple more transformers, turned on inter-process buffering, increased the read/write cache size for the hashed files to 999 MB, and tried a variety of array sizes, and have managed to get it down to 4 min.
I was wondering if there is anything else I could do to increase the performance? I am still not happy with 4 min, and I think it takes that long because each row has to go over the network for every one of those lookups with the DRS stages. The reason we are using DRS stages instead of hash files is that we are checking for constraints between two values.
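
To illustrate the idea of avoiding a network round trip per row for a between-two-values check, here is a minimal sketch in Python rather than DataStage. It assumes the range table is small enough to load once into memory and that its ranges do not overlap; the names `build_range_index`, `range_lookup`, and the sample `rate_bands` data are hypothetical and not taken from the job described above.

```python
from bisect import bisect_right

def build_range_index(ranges):
    """Sort (low, high, value) tuples by lower bound.

    Assumes inclusive bounds and non-overlapping ranges.
    """
    ordered = sorted(ranges, key=lambda r: r[0])
    lows = [r[0] for r in ordered]
    return lows, ordered

def range_lookup(index, key, default=None):
    """Return the value whose [low, high] range contains key, else default."""
    lows, ordered = index
    # Find the last range whose lower bound is <= key.
    pos = bisect_right(lows, key) - 1
    if pos >= 0:
        low, high, value = ordered[pos]
        if low <= key <= high:
            return value
    return default

# Hypothetical reference data, loaded once instead of queried per row.
rate_bands = [(0, 999, "A"), (1000, 4999, "B"), (5000, 9999, "C")]
index = build_range_index(rate_bands)

rows = [250, 1000, 4999, 12000]
results = [range_lookup(index, amount) for amount in rows]
```

Each lookup here is a binary search over the in-memory list, so the cost per row does not depend on network latency. Whether the same approach is practical inside a given job depends on the size of the reference data and on how the ranges are defined.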