Hi, I am new to DS so please forgive the naive question. I have a job which reads and writes over 12 million rows and normally runs at over 1,000 rows/sec, but occasionally (apparently at random) it slows to around 60 rows/sec, causing the job to run for over 9 hours.
Can anyone suggest where I should begin looking to investigate the cause? There is no network issue and no other change to the scheduled jobs running on the server.
Slow running job
Thanks for the swift response, Ray, Mayura.
Well, the scenario is that I have run the job during the day, when it takes around 16 minutes. At that time the load on the DS box and the associated DB2 data warehouse is low-ish. When the same job runs in its scheduled slot at around 23:00, when a number of other jobs are running (but not against the same tables), the run time has been 9 hours.
Now, I can accept that this is a busy window and allow for a somewhat slower run time, but equally, when the DS box goes quiet at around 03:00 I would expect the job to speed up and complete quickly. Instead it continues to run at its slow 60 rows/sec rate (as against 2,000+ normally).
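One way to test whether server load really explains the slowdown is to log the load average across the overnight window and line it up against the job's rows/sec. A minimal sketch, assuming a Unix host; the function name, log file, and sampling defaults are my own choices, not anything DataStage provides:

```python
# Hypothetical load sampler: append a timestamped 1-minute load average
# to a log file every few seconds, so the 23:00 slowdown can later be
# correlated with actual server load.
import os
import time
from datetime import datetime

def sample_load(samples=3, interval=1, path="ds_load.log"):
    with open(path, "a") as log:
        for _ in range(samples):
            load1, load5, load15 = os.getloadavg()  # Unix only
            log.write(f"{datetime.now():%Y-%m-%d %H:%M:%S} {load1:.2f}\n")
            log.flush()
            time.sleep(interval)

# Run with a long interval (e.g. samples=720, interval=60) during the
# overnight batch window; the small defaults here are just for a dry run.
sample_load()
```

If the log shows load dropping at 03:00 while the job stays at 60 rows/sec, the bottleneck is more likely something the job itself is holding (locks, a degraded access path, a full hashed file) than raw server contention.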
You need to give more details on this job. What are the different stages used?
Is it reading from a sequential file or a database?
What stage is the data being written into?
Are there any hashed files used in this job?
From what you have mentioned, it is fairly clear that the job runs slowly mainly because the server is overloaded (resources being contended for).
A few tips:
If reading data from hashed files, preload them into memory (stage properties).
If writing data to a database, increase the "Array Size" from its default of 1 to something larger (around 1000, or as convenient), and keep the "Transaction Size" at around 250.
Enable "In process" row buffering for the job; this may help a little.
Check whether a lot of stage variables are used; see if you can keep the same transformations and logic while avoiding them, and work out an alternative.
Usually job performance can be improved just through proper resource handling before tweaking the code, since that is less invasive; but if you have to change the code for some reason, then you have to.
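The "Array Size" and "Transaction Size" advice amounts to batching rows per database round trip and committing every so many rows, instead of a round trip and commit per row. The same idea sketched in plain Python with sqlite3 (table and column names are invented for illustration; in DataStage you set this through stage properties, not code):

```python
# Illustrative only: batch inserts (analogous to Array Size) and
# commit per batch (analogous to Transaction Size) instead of one
# round trip and one commit per row.
import sqlite3

rows = [(i, f"name{i}") for i in range(1000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, name TEXT)")

ARRAY_SIZE = 250  # rows sent to the database per batch
for start in range(0, len(rows), ARRAY_SIZE):
    batch = rows[start:start + ARRAY_SIZE]
    conn.executemany("INSERT INTO target VALUES (?, ?)", batch)
    conn.commit()  # one commit per batch rather than per row

count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)  # 1000
```

With 12 million rows, the difference between one commit per row and one per few hundred rows is exactly the kind of thing that turns a 16-minute run into hours.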