
Performance problems - running multiple jobs in parallel

Posted: Wed May 14, 2008 6:27 pm
by verify
We are using UNIX programs to invoke DataStage jobs. There are 300 jobs that load different types of sequential flat files into an Oracle database. DataStage picks up these flat files sequentially, i.e. the job for file2 starts only after file1 has been processed. We have noticed that server CPU utilization does not exceed 5% while the jobs run; 95% of the CPU sits idle.
Is there a way I can call 10 instances of my UNIX program that invokes the DataStage jobs to improve performance? Or please suggest a better way of achieving this.

Posted: Wed May 14, 2008 9:22 pm
by dnsjain
You can improve performance by running multiple jobs at the same time instead of one at a time. Some ways to achieve this:

1. Define a sequence job and start multiple jobs at once from within it.
2. If you are running the same job multiple times, enable multiple instances for the job and call it several times from the sequence job, each time with a different instance ID.
3. Schedule multiple jobs at once.
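
As a rough illustration of the "10 instances at a time" idea, here is a minimal sketch of a driver that keeps a fixed number of job invocations running concurrently. It assumes the jobs are started through the dsjob command line with -run and -jobstatus, and that a multi-instance job is addressed as jobname.instanceid; verify both against your installation. The function names and the max_parallel value are hypothetical, not something from this thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(project, job, instance_id):
    # Assumed dsjob invocation: -jobstatus makes dsjob wait for the job
    # and report its finishing status through the exit code.
    # "job.instance_id" is the assumed way to address one invocation
    # of a multi-instance job.
    return ["dsjob", "-run", "-jobstatus", project, f"{job}.{instance_id}"]


def run_one(command):
    # Run a single command and return its exit code.
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode


def run_parallel(commands, max_parallel=10):
    # Keep at most max_parallel commands running at once.
    # Exit codes come back in the same order as the commands.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))
```

To use it, build one command per flat file, for example build_command("MyProject", "LoadFile", "file1"), where the project and job names are placeholders, and pass the whole list to run_parallel. Each worker thread only waits on an external process, so a thread pool is enough here; the cap on workers is what stops all 300 jobs from starting at once.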

Posted: Wed May 14, 2008 9:59 pm
by chulett
Get yourself some real job control, something that will allow you to define dependencies and will keep as many jobs running simultaneously as possible. Something like... oh, I don't know... our Ken Bland's Job Control Utilities that he gives away free from his website.

Posted: Fri May 16, 2008 6:29 am
by ray.wurlod
Take an Economics 101 class - particularly the part where they talk about supply and demand.

You must monitor your jobs under normal (baseline) conditions, and perhaps also running in isolation under full data load. Then, from these measures, calculate the maximum demand you can place on the system and schedule accordingly.
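
The arithmetic behind that calculation might look like the sketch below. It is only an assumption about how to turn baseline measurements into a concurrency limit: it treats CPU and memory demand as adding up linearly across jobs, which real workloads (I/O waits, database contention) will not follow exactly. The function name, the budget figures and the memory numbers are all hypothetical; only the 5% CPU figure comes from the original post.

```python
def max_concurrent_jobs(per_job_cpu_pct, per_job_mem_mb,
                        cpu_budget_pct, mem_budget_mb):
    # How many jobs fit within each resource budget, assuming demand
    # scales linearly with the number of jobs running.
    by_cpu = int(cpu_budget_pct // per_job_cpu_pct)
    by_mem = int(mem_budget_mb // per_job_mem_mb)
    # The scarcer resource sets the limit; always allow at least one job.
    return max(1, min(by_cpu, by_mem))


# One job at a time used about 5% CPU; the other numbers are made up.
limit = max_concurrent_jobs(per_job_cpu_pct=5, per_job_mem_mb=200,
                            cpu_budget_pct=80, mem_budget_mb=4000)
print(limit)
```

Treat the result as a starting point to test against, then adjust it after watching the system under that load.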

The Resource Estimator tool you get in version 8.0 is particularly useful in this task. It will estimate resource consumption either from the job design or by running a sample of rows. Make sure it is a statistically meaningful sample size if you go this route.