improving performance

Posted: Fri May 13, 2011 1:46 am
by dnat
Hi,

I have several DataStage jobs which run to update multiple dimension and fact tables in a data warehousing application.

Most of the jobs take around 3 to 4 minutes to complete, and 2 or 3 jobs take around 40 minutes each. Altogether the run takes around 4 to 5 hours. We have an SLA for these jobs to complete within 6 hours. On most days the run completes on time, but on 2 or 3 days a month we miss the SLA.

Also, a job sometimes aborts because of high CPU utilization; the same Unix server is shared by multiple applications.

On days when the run takes more than 6 hours, I am not able to find out what the reason is. The number of records seems to be almost the same, and the log doesn't show anything different.

I suspect two things:

1. Higher CPU utilization. I am not sure whether this affects the speed of the job. Does it?

2. Higher DB usage. I am checking with the DBA; he hasn't responded yet.

What other parameters should we look at?

Posted: Fri May 13, 2011 2:09 am
by ray.wurlod
Other things happening on the machine at the same time are an often overlooked cause.
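
One way to check for this is to record the machine's load while the batch runs and compare slow days with normal ones. The following is only a minimal sketch, assuming Python is available on the Unix server; the function names, the CSV file name and the sampling interval are illustrative and not part of DataStage.

```python
import csv
import os
import time
from datetime import datetime


def sample_load(cores=None):
    """Return (timestamp, 1-minute load average, load per core).

    Uses os.getloadavg(), which is available on Unix only.
    """
    cores = cores or os.cpu_count() or 1
    load1, _, _ = os.getloadavg()
    stamp = datetime.now().isoformat(timespec="seconds")
    return stamp, load1, load1 / cores


def log_load(path, samples, interval_s):
    """Append `samples` load readings to a CSV file, one every `interval_s` seconds."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for i in range(samples):
            writer.writerow(sample_load())
            f.flush()  # keep the file usable even if the logger is killed
            if i < samples - 1:
                time.sleep(interval_s)


# Hypothetical usage: one reading per minute for six hours.
# log_load("etl_load.csv", samples=360, interval_s=60)
```

A load per core that stays near or above 1.0 during the slow runs would suggest the jobs are competing with other work for CPU, rather than something inside the jobs having changed.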

Posted: Fri May 13, 2011 7:45 am
by greggknight
Yes, if your CPU is saturated you will see degradation.

Just a thought.
I did a little testing once.
I have a 4-core machine.

So I set up four config.apt files:
one node
two nodes
three nodes
four nodes

I then ran a batch with some jobs in it on one node and got my time.
I then ran on two nodes; the time was reduced by almost 50%.
I ran on three nodes; it was not three times faster than one node. Actually it was faster than the two-node run by only a minute or so.
I then ran on four nodes and performance actually went the other way.
Reason: the more nodes you have, the more processes are spawned, and the more processes, the more CPU consumption. You have to find the balance. Your configuration is a good starting place, as well as what else is running. If other processes are running and your process maxes out the CPU by itself, then everything will slow down.
I don't know your config, but those were my test results. I use 50%: for 4 cores I use 2 nodes, with 2 controllers for my resource disks and a third for my scratch.
Just some thoughts.
You could also look at the designs of the slower jobs; there could be some changes that could be made there as well.

Bottom line: you just need to analyze the whole process.
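
The node-count test described above can be summarized with a few lines of Python once you have your own timings. This is a rough sketch only: the elapsed times below are made-up placeholders in the shape of the results described (they are not measurements from this thread), and the 10% tolerance is an arbitrary assumption.

```python
def scaling_report(elapsed_by_nodes):
    """Map node count -> (speedup vs. one node, efficiency per node)."""
    base = elapsed_by_nodes[1]
    report = {}
    for nodes in sorted(elapsed_by_nodes):
        speedup = base / elapsed_by_nodes[nodes]
        report[nodes] = (speedup, speedup / nodes)
    return report


def pick_node_count(elapsed_by_nodes, tolerance=0.10):
    """Smallest node count whose time is within `tolerance` of the fastest run.

    Preferring fewer nodes leaves CPU headroom for whatever else
    runs on the same server.
    """
    best = min(elapsed_by_nodes.values())
    limit = best * (1 + tolerance)
    return min(n for n, t in elapsed_by_nodes.items() if t <= limit)


# Hypothetical elapsed minutes per node count; replace with real measurements.
timings = {1: 40.0, 2: 21.0, 3: 20.0, 4: 24.0}

for nodes, (speedup, efficiency) in scaling_report(timings).items():
    print(f"{nodes} node(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")

print("suggested node count:", pick_node_count(timings))
```

With placeholder numbers of this shape, the third node buys almost nothing and the fourth makes things worse, so the sketch settles on two nodes, which is the same balance described in the post.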