
Tuning DS server

Posted: Wed Aug 18, 2004 12:50 am
by asvictor
Hi

One of our clients came to us with a request to tune their jobs. When I checked the jobs, they were quite OK, but performance was badly affected while running them. The DBA informed us that they have four CPUs, but I don't know whether DataStage is configured to use all the available CPUs. I suspect it is still using only one CPU, which affects the performance of the jobs. How do I check whether DS uses all four CPUs? If it is not using them, what should I do to make DS use all the available CPUs?

Do I need to change a setting in the database (DB2), or a setting in the DataStage server?

Can someone help me, please?

Cheers,

Victor

Posted: Wed Aug 18, 2004 1:09 am
by richdhan
Hi Victor,

How was the job performing initially, when it was delivered to the client?

When you run the job, you can collect performance statistics for active stages by selecting the performance statistics option on the Tracing tab in the Job Run Options dialog.

Check the chapter "Optimizing Performance in Server Jobs" in servjdev.pdf. It gives more information on this point, and shows how to find out whether the job is performing poorly because it is CPU limited or I/O limited.

If it is CPU limited, try turning on row buffering for the job and running it again. The setting is available on the Performance tab of the Job Properties window. See if that improves performance.

In-process row buffering enables the data to be buffered rather than passed row by row. Inter-process row buffering enables each active stage to run in a separate process. You can also use an IPC stage explicitly in the job to enable inter-process buffering.
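
If it helps to picture what inter-process buffering does, here is a rough Python analogy (just an analogy, not DataStage internals): two "stages" run as separate processes joined by a bounded buffer, so the producer is not blocked on every single row.

    # Analogy only: two "active stages" as separate processes joined by a
    # bounded buffer, like an IPC stage link. Not DataStage internals.
    from multiprocessing import Process, Queue

    def extract_stage(buffer):
        for row in range(10):          # stand-in for rows read from a source
            buffer.put(("row", row))   # blocks only when the buffer is full
        buffer.put(None)               # end-of-data marker

    def load_stage(buffer):
        while True:
            row = buffer.get()
            if row is None:
                break
            print("loaded", row)       # stand-in for the write to the target

    if __name__ == "__main__":
        link = Queue(maxsize=128)      # bounded buffer between the processes
        stages = [Process(target=extract_stage, args=(link,)),
                  Process(target=load_stage, args=(link,))]
        for s in stages:
            s.start()
        for s in stages:
            s.join()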

HTH
--Rich

multi cpu configs

Posted: Wed Aug 18, 2004 7:23 am
by flashgordon
Just wanted to add one other thing to the excellent post above: unless you are using Enterprise Edition (Parallel Extender), one job can use at most one CPU. So unless you are running lots of jobs at the same time, or you have a lot of other non-DataStage work on the DataStage server, extra CPUs may or may not help the DataStage job you are having performance problems with. One DataStage job uses one PID, and one PID can use at most one CPU.

... Flash Gordon

Posted: Wed Aug 18, 2004 8:07 am
by tonystark622
flashgordon wrote: "unless you are using Enterprise Edition (Parallel Extender), one job can use at most one CPU."
In my experience, at least on HP-UX (Unix), this is not true. With inter-process row buffering turned on, each active stage can run in its own process, and these processes can be scattered across all the processors that DataStage can access on the box.

Tony

Posted: Wed Aug 18, 2004 3:29 pm
by ray.wurlod
It's not safe to generalize that a DataStage job will use only one CPU. For starters, the job itself runs in one process, while an active stage will run in a child process; the likelihood is that they will get different CPUs. Further, independent active stages (for example, separate processing streams) in the one job will all get separate processes. And I haven't even mentioned the IPC stage or row buffering yet! Nor running multiple instances of server jobs!
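
You can watch this from the operating system while a job runs. A quick Python sketch that filters ps output for DataStage processes (the DSD.RUN and DSD.StageRun names are what I typically see on Unix, so verify against your own installation; the ps options below are the common Unix syntax and may differ on your platform):

    # Rough sketch: sample which DataStage processes exist and how much CPU
    # each is using. The process names (DSD.RUN, DSD.StageRun, phantom) are
    # the ones typically seen on Unix; check what your own release shows.
    import subprocess

    ps = subprocess.run(["ps", "-eo", "pid,pcpu,args"],
                        capture_output=True, text=True).stdout
    for line in ps.splitlines():
        if any(tag in line for tag in ("DSD.RUN", "DSD.StageRun", "phantom")):
            print(line)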

I'd be looking at other things that slow throughput down. For example, are the logs purged regularly? Are there squillions of old files in the &PH& directory? Does the job execute the "query from hell" during extraction? Is loading blocked by other processes (for example backups, batch loads, etc.)?
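
For the &PH& directory, something along these lines clears out old entries (a sketch only: the project path is a placeholder for your own, and you should only clear &PH& when no jobs are running):

    # Sketch: remove &PH& files older than 7 days. Run only when no jobs
    # are active. The project path below is a placeholder.
    import os, time

    ph_dir = "/u1/dsadm/Ascential/DataStage/Projects/MyProject/&PH&"
    cutoff = time.time() - 7 * 24 * 60 * 60

    for name in os.listdir(ph_dir):
        path = os.path.join(ph_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)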

Posted: Thu Aug 19, 2004 7:38 am
by aaronej
Ray,

What standard practice would you recommend for purging log files? What is your best practice for this?

Thanks!

Aaron

Posted: Thu Aug 19, 2004 7:53 am
by datastage
In-process and inter-process row buffering can definitely improve performance, but they can also hurt it. I don't like the idea of using the project-wide setting; instead, analyze and test on a job-by-job basis for complex jobs and jobs whose performance needs improving.

Also, don't forget to work with the DBAs to help tune big queries. Often doing simple selects and moving grouping into Aggregator stages and joins into hashed file lookups will improve performance. Depending on your needs, look into all the options on the database load side: change how often commits are performed; compare deleting all rows vs. truncating the table vs. dropping and recreating the table; look into dropping indexes prior to the load and recreating them afterwards; and verify when "insert new or update existing" should be used versus "update existing and insert new".
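
On the commit-interval point, the idea is simply to commit every N rows instead of after every row. A generic Python DB-API sketch (the table, columns and "?" placeholder style are assumptions on my part; they vary by driver):

    # Sketch: commit every N rows instead of per row. Connection details
    # and table/column names are placeholders.
    def load_rows(conn, rows, commit_every=1000):
        cur = conn.cursor()
        for i, row in enumerate(rows, start=1):
            cur.execute("INSERT INTO target_table (c1, c2) VALUES (?, ?)", row)
            if i % commit_every == 0:
                conn.commit()          # amortize commit cost over many rows
        conn.commit()                  # commit the final partial batch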

Back to DS: typically the best performance is gained when hashed files have write caching turned on and are pre-loaded into memory for lookups. For large hashed files you can also increase the minimum modulus to improve write performance.
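
For the minimum modulus, here is a rough sizing sketch. The 2048-byte group size and the formula are a common rule of thumb I use, not official guidance, so check the sizing advice in the docs:

    # Rough sizing sketch for a dynamic (Type 30) hashed file. Rule of
    # thumb only: groups are GROUP.SIZE * 2048 bytes, so estimate how
    # many groups the expected data volume will need.
    def estimate_minimum_modulus(row_count, avg_row_bytes, group_size=1):
        group_bytes = group_size * 2048
        data_bytes = row_count * avg_row_bytes
        return max(1, -(-data_bytes // group_bytes))   # ceiling division

    print(estimate_minimum_modulus(1_000_000, 100))    # ~48829 groups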

Performance tuning in DS is endless... hope this helps, but as others mentioned, the first question to answer is whether you are CPU bound or I/O bound, and what effect network traffic is having.

Purging of DataStage job logs

Posted: Thu Aug 19, 2004 3:37 pm
by ray.wurlod
Best practice is to have automatic log purging enabled, and to perform manual log purging when a large number of events is logged.

Automatic log purging is triggered when a job completes successfully. It is not triggered if the job fails to complete successfully.

If you are enabling automatic purging in Administrator for the first time, you may need to propagate the settings to the logs of existing jobs.

Posted: Fri Aug 20, 2004 6:06 am
by aaronej
First, this is a really great and (for me) timely thread - thanks to everyone who has chimed in and provided great information. Second, what are the downsides to inter-process row buffering? Why would you not want it turned on at the project level? I understand the concept of what it is doing; I'm just not sure I fully comprehend the downside.

Any help in further understanding this would be great.

Thanks!

Aaron

Posted: Fri Aug 20, 2004 6:21 am
by chulett
The downsides? Mostly what I've seen is either the buggy way it was implemented, or just the fact that it is the nature of the beast. :? This impression may also change from version to version. Heck, Ascential sent out a big alert yesterday for anyone using 7.5 Enterprise Edition to download 7.5a ASAP because row buffering is horked. :wink:

I recall quite a number of posts here that start out: "I've got the weirdest problem" and end with "turn off row buffering". We've seen Aggregator problems, Oracle problems, all kinds of very interesting problems that magically go away when buffering is turned off. My own personal most recent favorite is a phantom "Unique Key Constraint Violation" caused by buffering.

In my philosophy, we keep it turned off at the Project level and only turn it on in those jobs that "need" it or will benefit from it - and only then after significant testing.

Posted: Fri Aug 20, 2004 7:04 am
by tonystark622
Row Buffering is horked in 7.1.0 also... Go ahead, ask me how I know... :lol:

Tony

Posted: Fri Aug 20, 2004 7:23 am
by chulett
Um... how do you know, Tony?

Posted: Fri Aug 20, 2004 7:33 am
by tonystark622
I've been fighting jobs that "hang" or abort with "X columns were expected, Y were found" messages for several weeks. Support folks suggested turning off Row Buffering on these jobs.

Guess what? There's also a problem where, if you disable row buffering, sometimes it doesn't actually get disabled. You know this has happened when you reset the job and get "ipc_getnext()" and "timeout waiting for mutex" errors on jobs that hang or abort. Heh.

Then, after you do get row buffering turned off, you're watching a job and think it's hanging, because the row counts in a section of cascaded transformers don't increment in real time; they stay at zero. Just be patient. It will finish in good time, and when it does, the row counts will be updated.

All good now, though. I did mention that I love my job, right? :lol:

Tony

Posted: Fri Aug 20, 2004 11:25 am
by rggoud
Hi,

We recently upgraded from DS 7.0 to DS 7.0.1. We got the same errors (mutex error / "expecting N columns, Y columns found"). Thanks to Tony's suggestion we turned off row buffering; our jobs now run fine, but they are taking a long time to complete. Is there any fix so that our jobs will run correctly with inter-process row buffering enabled?

Thanks.

Raj.

Posted: Fri Aug 20, 2004 11:40 am
by tonystark622
Aside from upgrading (and I'm not sure of that) or getting some kind of patch from Ascential (I'm also not sure of that), I don't know of a fix. You might try to restructure your jobs so that you can run multiple instances at once, then partition your data and send it through the multiple instances. Unfortunately, your job design doesn't always allow this.
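
The partitioning itself can be as simple as splitting the source on a key before feeding each instance. A sketch of the idea (the key names and instance count here are made up):

    # Sketch: split rows across N job instances by hashing a key, so each
    # multiple-instance invocation works on its own slice of the data.
    import zlib

    NUM_INSTANCES = 4

    def partition_for(key):
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        return zlib.crc32(key.encode()) % NUM_INSTANCES

    rows = [("cust_001", 10), ("cust_002", 20), ("cust_003", 30)]
    partitions = {n: [] for n in range(NUM_INSTANCES)}
    for key, amount in rows:
        partitions[partition_for(key)].append((key, amount))

    for n, part in sorted(partitions.items()):
        print("instance", n, "handles", part)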

Good Luck,
Tony