Slow running job

Posted: Tue Jun 08, 2010 5:26 am
by Parcival
Hi, I am new to DS so please forgive the naive question. I have a job which reads and writes over 12 million rows. It normally runs at over 1000 rows/sec, but occasionally (apparently at random) it runs much slower, at around 60 rows/sec, causing the job to run for over 9 hrs.

Can anyone suggest where I should begin investigating the cause of this? There is no network issue and no other change in the scheduled jobs running on the server.

Posted: Tue Jun 08, 2010 5:58 am
by ray.wurlod
Welcome aboard.

What's different about the system, and other systems, during the slow times? "Nothing" clearly is not the correct answer.

Posted: Tue Jun 08, 2010 6:30 am
by mayura
ray.wurlod wrote:
Welcome aboard.

What's different about the system, and other systems, during the slow times? "Nothing" clearly is not the correct answer.

What operations are you performing in the job? Please explain a bit so we can help find a solution.

Posted: Tue Jun 08, 2010 7:57 am
by Parcival
Thanks for the swift response, Ray, Mayura.

Well, the scenario is that I have run the job during the day when it takes around 16 mins. At this time the load on the DS box and associated DB2 Data Warehouse is low-ish. When the same job runs in the same place in its scheduled spot at around 23:00, when a number of other jobs are running (but not against the same tables), the run time has been 9 hrs.

Now, I can accept that this time slot could be a busy one and allow for a slightly slower run time but, equally, when the DS box goes quiet at around 03:00 I would expect the job to complete quickly. Instead it continues to run at its slow rate of 60 rows/sec (as against 2000+ normally).
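
As a rough sanity check on the figures quoted in this thread, the run time implied by a row rate (and the rate implied by a run time) can be worked out directly. This is only a sketch: the 12 million row count, the 16 minute and 9 hour run times, and the 60 rows/sec rate are taken from the posts above, the function names are hypothetical, and the rows/sec shown by the job monitor may be a per-link figure rather than a whole-job average.

```python
def hours_at_rate(rows, rows_per_sec):
    """Hours needed to process `rows` at a constant rate."""
    return rows / rows_per_sec / 3600.0


def implied_rate(rows, seconds):
    """Average rows/sec implied by processing `rows` in `seconds`."""
    return rows / seconds


ROWS = 12_000_000  # row count quoted in the first post

# Time a full run would take at a constant 60 rows/sec
print(round(hours_at_rate(ROWS, 60), 1), "hours at 60 rows/sec")

# Average rates implied by the two quoted run times
print(round(implied_rate(ROWS, 16 * 60)), "rows/sec for a 16 minute run")
print(round(implied_rate(ROWS, 9 * 3600)), "rows/sec for a 9 hour run")
```

If the quoted rates and run times do not reconcile for a given link, that is worth mentioning when reporting the problem, since it may mean the rate varies during the run or that the figures refer to different links.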

Posted: Fri Jun 18, 2010 2:39 pm
by DSShishya
You need to give more details on this job. What stages does it use?
Is it reading from a sequential file or a database?
What stage is the data being written into?
Are there any hashed files used in this job?

From what you have mentioned, it's very clear that the job runs slowly mainly because the server is overloaded (resources being overstretched).

A few tips:

If reading data from hashed files, preload them into memory (Stage properties).

If writing data to a database, increase the "Array Size" from the default of 1 to something larger (around 1000, or whatever suits), and keep the "Transaction Size" at around 250.

Enable "In process" row buffering for the job; this may help a little.

Check whether a lot of stage variables are used, and see if you can keep the same transformations and logic without them by finding a workaround.

Usually it is better to try improving job performance through proper resource handling before tweaking the code, since that is less invasive. But if you have to change the code for some reason, then you have to.
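
To illustrate why the "Array Size" tip above can matter, here is a minimal sketch of the arithmetic. It assumes that Array Size is the number of rows sent to the database per call and that Transaction Size is the number of rows per commit; the function names are hypothetical, and the 12 million row count is the one quoted earlier in the thread.

```python
def round_trips(rows, array_size):
    """Database calls needed if `array_size` rows are sent per call."""
    return -(-rows // array_size)  # integer ceiling division


def commits(rows, transaction_size):
    """Commits issued if one commit happens every `transaction_size` rows."""
    return -(-rows // transaction_size)


ROWS = 12_000_000  # row count quoted in the first post

print(round_trips(ROWS, 1))     # calls with the default Array Size of 1
print(round_trips(ROWS, 1000))  # calls with an Array Size of 1000
print(commits(ROWS, 250))       # commits with a Transaction Size of 250
```

Under these assumptions, a larger Array Size cuts the number of calls to the database by the same factor, which is why it tends to help most when each call is slow, for example on a busy database server.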

Posted: Fri Jun 18, 2010 4:48 pm
by ray.wurlod
You still need to report what's different (about the system, databases, DataStage, ... anything) between when the job runs fast and when the job runs slowly.
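
One way to gather that kind of evidence is to record the server load during a fast run and a slow run and compare the two. The following is only a sketch under stated assumptions: it assumes the DataStage server is a Unix box (`os.getloadavg` is not available on Windows), the function names are hypothetical, and load average is just one of several things worth comparing (database locks, disk I/O and memory are others).

```python
import os
import time
from statistics import mean


def sample_load(n_samples=5, interval_sec=1.0):
    """Collect 1-minute load averages at fixed intervals (Unix only)."""
    samples = []
    for _ in range(n_samples):
        samples.append(os.getloadavg()[0])  # first element is the 1-minute average
        time.sleep(interval_sec)
    return samples


def compare_runs(fast_samples, slow_samples):
    """Summarise load samples taken during a fast run and a slow run."""
    fast_avg = mean(fast_samples)
    slow_avg = mean(slow_samples)
    ratio = slow_avg / fast_avg if fast_avg else float("inf")
    return {"fast_avg": fast_avg, "slow_avg": slow_avg, "ratio": ratio}
```

For example, calling `sample_load()` while the daytime run is in progress and again during the 23:00 run, then passing both lists to `compare_runs`, gives a simple figure to include when reporting what differs between the two.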