Performance Issue

rasi · Post by **rasi** » Tue Feb 07, 2006 6:58 pm

Breaking the 25 lookups into several small jobs is more efficient than having one single monster job.

ray.wurlod · Post by **ray.wurlod** » Wed Feb 08, 2006 12:14 am

One job, but with multiple Transformer stages (say, not more than four lookups per Transformer stage) would also be OK. You can enable inter-process row buffering (and, if desired, interpolate IPC stages to make it obvious that there are separate processes involved). Of course, splitting into multiple processes is not a great gain if you only have a single CPU.
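As a rough illustration of the idea behind inter-process row buffering, the Python sketch below chains several lookup stages, each running as its own worker and connected to the next by a bounded buffer. It is only an analogy: DataStage uses separate operating-system processes, while this sketch uses threads, and every name in it (`lookup_stage`, `run_pipeline`, `buffer_rows`) is hypothetical rather than anything from the product.

```python
import queue
import threading

SENTINEL = object()  # marks end of data on a link


def lookup_stage(lookups, inbox, outbox):
    """Apply a small group of lookups to each row, then pass the row on."""
    while True:
        row = inbox.get()
        if row is SENTINEL:
            outbox.put(SENTINEL)  # propagate end-of-data downstream
            return
        for column, table in lookups:
            # A failed lookup yields None, like an unmatched reference row.
            row = {**row, column: table.get(row["key"])}
        outbox.put(row)


def run_pipeline(rows, stage_lookups, buffer_rows=128):
    """Run one worker per stage, linked by bounded buffers (hypothetical size)."""
    links = [queue.Queue(maxsize=buffer_rows) for _ in range(len(stage_lookups) + 1)]
    workers = [
        threading.Thread(target=lookup_stage, args=(lookups, links[i], links[i + 1]))
        for i, lookups in enumerate(stage_lookups)
    ]

    def feed():
        # The source runs in its own worker so a full buffer cannot block the reader.
        for row in rows:
            links[0].put(row)
        links[0].put(SENTINEL)

    feeder = threading.Thread(target=feed)
    for worker in workers + [feeder]:
        worker.start()

    results = []
    while True:
        row = links[-1].get()
        if row is SENTINEL:
            break
        results.append(row)

    for worker in workers + [feeder]:
        worker.join()
    return results
```

Each stage can work on a different row at the same time, which only pays off when there are spare CPUs to run the stages concurrently. On a single CPU the hand-offs through the buffers are pure overhead.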

rajkraj · Post by **rajkraj** » Wed Feb 08, 2006 8:57 am

Thanks for your responses. Ray, I have a question: which is the better option, a single job with multiple Transformer stages, or splitting the big job into multiple jobs?

ray.wurlod · Post by **ray.wurlod** » Wed Feb 08, 2006 10:58 pm

"Best" is too subjective a term. 25 jobs is a lot to maintain. On the other hand, 25 separate jobs will make troubleshooting easier. What are your priorities?

aartlett · Post by **aartlett** » Thu Feb 09, 2006 4:18 pm

I have to agree with Ray: one job, multiple Transformer stages.
Load as many hashed files into RAM (enable caching) as possible.

If volumes are huge you can even containerise the lookup transforms: split the stream using a Link Partitioner, process each stream through the lookups, and then use a Link Collector to bring it all together.

This is not required if the hashed files load into RAM. Use Administrator to increase the cache buffer size.

I like to hammer the CPU out of a box. Remember, a lost CPU cycle is a wasted CPU cycle. Try to keep the box at about 5% idle.
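The partition, lookup, and collect layout described above can be sketched in Python as follows. This is only a data-flow illustration under assumptions: plain dictionaries stand in for hashed files cached in RAM, the streams are processed one after another rather than in parallel, and all function names are hypothetical.

```python
from itertools import zip_longest


def partition_round_robin(rows, n_streams):
    """Deal rows out to n_streams lists in turn, like a round-robin partitioner."""
    streams = [[] for _ in range(n_streams)]
    for i, row in enumerate(rows):
        streams[i % n_streams].append(row)
    return streams


def apply_lookups(stream, cached_tables):
    """Enrich each row from in-memory tables (dicts standing in for cached hashed files)."""
    return [
        {**row, **{col: table.get(row["key"]) for col, table in cached_tables.items()}}
        for row in stream
    ]


def collect_round_robin(streams):
    """Interleave the streams again, taking one row from each in turn."""
    missing = object()  # filler for streams that ran out of rows first
    collected = []
    for group in zip_longest(*streams, fillvalue=missing):
        collected.extend(row for row in group if row is not missing)
    return collected
```

In a real job each stream would run in its own process, which is where the gain on a multi-CPU box would come from. The sketch only shows that a round-robin split followed by a round-robin collect returns the rows in their original order.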

Ocean · Post by **Ocean** » Thu Sep 28, 2006 3:15 am

Hi Ray,

I created a job with four Transformers, one having 5~7 lookups. With inter-process row buffering enabled, overall performance is about 50 rows/sec. With the option disabled, the job runs at about 250 rows/sec.

Is there any issue with this? Any suggestions?

Thanks,

ArndW · Post by **ArndW** » Thu Sep 28, 2006 3:50 am

Ocean - if you address a post to Ray, does that mean you don't want to hear from anyone else? The performance change is expected when you have a multi-CPU system. What is your question? Oops, I'll take that back since I'm not Ray.

ray.wurlod · Post by **ray.wurlod** » Thu Sep 28, 2006 5:47 am

How many rows? Rows/sec is an almost meaningless metric for a whole lot of reasons. For example, the start-up time and close-down time are counted in the elapsed time, even though no rows are processed. For a small number of rows, you might also be waiting for the (IPC) buffers to fill.
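To see why rows/sec can mislead for small volumes, here is a minimal worked calculation in Python. The overhead and processing rate are made-up numbers chosen only to show the effect, and the function name is hypothetical.

```python
def reported_rows_per_sec(rows, overhead_sec, true_rate):
    """Rows/sec as reported when fixed start-up and close-down time is included.

    rows         -- number of rows processed
    overhead_sec -- assumed fixed start-up plus close-down time, in seconds
    true_rate    -- assumed rows/sec while rows are actually flowing
    """
    elapsed = overhead_sec + rows / true_rate
    return rows / elapsed


# Same assumed job (10 s overhead, 1000 rows/sec while running), two volumes.
small = reported_rows_per_sec(1_000, overhead_sec=10.0, true_rate=1000.0)
large = reported_rows_per_sec(1_000_000, overhead_sec=10.0, true_rate=1000.0)
print(f"{small:.1f} rows/sec for 1,000 rows")
print(f"{large:.1f} rows/sec for 1,000,000 rows")
```

With these assumed figures the small run reports about 91 rows/sec and the large run about 990 rows/sec, even though the job processes rows at the same speed in both cases.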

Ocean · Post by **Ocean** » Thu Sep 28, 2006 11:12 pm

Hi Ray,

It is not only the rows/sec figure. The elapsed time is also longer with inter-process row buffering than without it.

Hi ArndW,

I just happened to address the post to Ray after reading his advice. I really appreciate your help.

The development server has 1 processor and production has 4, so I expect better performance in production. Can I conclude that this is a processor issue?

Thanks all for the advice,