Hi,
I have several DataStage jobs that update multiple dimension and fact tables in a data warehousing application.
Most of the jobs take around 3 to 4 minutes to complete, and 2 or 3 jobs take around 40 minutes each. Altogether the run takes around 4 to 5 hours. We have an SLA for these jobs to complete within 6 hours. On most days they finish on time, but 2 or 3 days a month we miss the SLA.
Also, sometimes a job aborts due to high CPU utilization, since the same Unix server is shared by multiple applications.
On days when the run takes more than 6 hours, I am not able to find the reason. The number of records seems to be almost the same, and the log doesn't show anything different.
I suspect two things:
1. Higher CPU utilization. I am not sure whether this affects the speed of the job. Does it?
2. Higher DB usage. I am checking with the DBA; he hasn't responded yet.
What other parameters do we need to look at?
improving performance
Yes, if your CPU is saturated you will see degradation.
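One rough way to check whether the box is saturated while the batch runs is to compare the load average with the core count. A minimal sketch in Python, assuming a Unix server where `os.getloadavg()` is available; the function names and the threshold of 1.0 per core are my own arbitrary choices, not DataStage settings:

```python
import os

def load_per_core(load_avg, cores):
    """Return the load average divided by the number of cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores

def is_saturated(load_avg, cores, threshold=1.0):
    """True when load per core is at or above the (assumed) threshold."""
    return load_per_core(load_avg, cores) >= threshold

if __name__ == "__main__":
    one_min, five_min, fifteen_min = os.getloadavg()  # Unix only
    cores = os.cpu_count() or 1
    print(f"1-min load {one_min:.2f} on {cores} cores, "
          f"saturated: {is_saturated(one_min, cores)}")
```

Logging this every few minutes during the batch window would at least show whether the slow days line up with a busy server.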
Just a thought.
I did a little testing once.
I have a 4-core machine.
So I set up four config.apt files:
one node
two nodes
three nodes
four nodes.
I then ran a batch with some jobs in it on one node and got my time.
I then ran on two nodes; the time was reduced by almost 50%.
I ran on three nodes; it was not three times faster than one node, actually it was only slightly faster than two nodes, by a minute or so.
I then ran on four nodes and performance actually went the other way.
Reason: the more nodes you have, the more processes are spawned, and the more processes, the more CPU consumption. You have to find the balance. Your configuration is a good starting place, as well as what else is running. If other processes are running and your process maxes out the CPU by itself, then everything will slow down.
Just a thought. I don't know your config, but those were my testing results. I use 50%: for 4 cores I use 2 nodes, with 2 controllers for my resource disks and a third for my scratch.
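If you repeat that kind of test, the choice of node count can be reduced to a small rule: take the fewest nodes that get within a few percent of the best time. A sketch in Python, assuming you have recorded elapsed minutes per node count; `pick_node_count`, the 5% tolerance, and the timings are all made up for illustration and only shaped like the results described above:

```python
def pick_node_count(timings, tolerance=0.05):
    """Pick the smallest node count whose elapsed time is within
    `tolerance` (as a fraction) of the best time measured."""
    if not timings:
        raise ValueError("no timings supplied")
    best = min(timings.values())
    for nodes in sorted(timings):
        if timings[nodes] <= best * (1 + tolerance):
            return nodes

# Hypothetical elapsed minutes per node count (not real measurements)
example = {1: 40.0, 2: 21.0, 3: 20.5, 4: 24.0}
print(pick_node_count(example))
```

With these invented numbers it settles on 2 nodes, because the third node buys less than the tolerance and the fourth makes things worse.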
Just some thoughts.
You could also look at the design of the slower jobs; there could be some changes that could be made there as well.
Bottom line: you just need to analyze the whole process.
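To start that analysis, it helps to line up per-job elapsed times from a normal day against a day that missed the SLA and see which jobs account for the extra time. A rough Python sketch, assuming you can pull job name and elapsed minutes out of your logs; the function name, job names, and timings below are hypothetical:

```python
def slowest_regressions(normal, slow, top=3):
    """Rank jobs by extra minutes taken on the slow day.

    `normal` and `slow` map job name -> elapsed minutes; only jobs
    present in both runs are compared.
    """
    deltas = {job: slow[job] - normal[job] for job in normal if job in slow}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]

# Hypothetical job names and timings, for illustration only
normal_day = {"load_dim_customer": 4.0, "load_fact_sales": 40.0, "load_dim_product": 3.0}
slow_day = {"load_dim_customer": 4.5, "load_fact_sales": 95.0, "load_dim_product": 3.0}

for job, extra in slowest_regressions(normal_day, slow_day):
    print(f"{job}: +{extra:.1f} min")
```

If one or two of the long jobs carry most of the delay, those are the ones to check against CPU and database activity at the time they ran.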
"Don't let the bull between you and the fence"
Thanks
Gregg J Knight
"Never Never Never Quit"
Winston Churchill