Getting Job Run Statistics

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
DS_SUPPORT
Premium Member
Posts: 232
Joined: Fri Aug 04, 2006 1:20 am
Location: Bangalore

Getting Job Run Statistics

Post by DS_SUPPORT »

We have a huge number of jobs already in production, and a new requirement has come in to capture job run statistics to a table.

We could do it by writing a separate job and calling it from the after-job subroutine, but that would require changing all the jobs. I want to achieve this with minimum impact.

All our jobs already use a shared container to capture the last-run date/time, and it is called after the target insert. We have a separate path for target updates as well.

So I want to know whether writing a routine and calling it from the shared container would solve my issue. My doubt is that, at run time, the shared container is treated as part of the job, so if I check the job status it will still show as running.

What would be the best approach for handling this with minimal impact?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The best approach is not to measure from inside the thing being measured. Wait till the job finishes, then do the measuring.

The other best approach is to let someone else do the work for you. Download ETLstats from Kim Duke's website for free.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

How do you run your jobs? If you run them from sequences, then I recommend the last job in each sequence get row counts for all the jobs in that sequence, and that you run it in the background rather than waiting for it to come back; otherwise you slow down your whole run. If you run 200 jobs and it takes 1 or 2 minutes per job to get row counts, that can add several hours to your overall load. Running in the background does not slow down your other jobs, and getting row counts is not a big load on the system.
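The arithmetic is worth spelling out: 200 jobs at 1-2 minutes each is 200-400 minutes of added wall-clock time if the row-count step runs in the foreground. A minimal shell sketch of the background pattern (job names are hypothetical, and the real `dsjob` calls are left as comments):

```shell
# Sketch: collect row counts without blocking the main load.
# PROJECT and the job names are made up for illustration.

run_job() {
    echo "running job $1"
    # dsjob -run -wait "$PROJECT" "$1"    # real invocation would go here
}

collect_stats() {
    # dsjob -report "$PROJECT" "$1"       # real invocation would go here
    echo "collected stats for $1"
}

for JOB in load_customers load_orders; do
    run_job "$JOB"
    collect_stats "$JOB" &    # backgrounded: the next job starts immediately
done
wait                          # let any outstanding collectors finish
```

The `&` is the whole trick: the collector overlaps with the next job instead of extending the critical path.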

If you run jobs through a shell script, then we automatically run this stats job after every normal job: the script calls a copy of itself. It also checks that it is not currently running the get-row-counts job, so it does not get into an infinite loop calling itself over and over.
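The self-calling wrapper with its loop guard can be sketched like this (the job name "GetRowCounts" and the `dsjob` call are placeholders, not the actual ETLstats names):

```shell
# Sketch of the wrapper pattern: every normal job is followed by a
# background call of this same script for the stats job, and a guard
# stops the stats job from triggering itself.

STATS_JOB="GetRowCounts"

run_and_collect() {
    job="$1"
    echo "running $job"
    # dsjob -run -wait "$PROJECT" "$job"   # real invocation would go here

    # Guard: only chain the stats job after a *normal* job; without this
    # check the script would call itself forever.
    if [ "$job" != "$STATS_JOB" ]; then
        run_and_collect "$STATS_JOB" &
    fi
}

run_and_collect load_orders
wait
```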

If you get row counts in an after-job routine, then the job has to wait for the row-count collection to finish, so it automatically slows your overall load down. The second issue is that if you need to do something else in your after-job routine, like FTP a file, then you have a problem. Another issue is that the job is still running while you are getting row counts, so you do not know the overall run time, and you do not know the job status (whether it failed or not). We get row stats on failed jobs too; that way we know how many failures we had in a month. Very important.
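This is the point about measuring after completion: only once the job has finished does `dsjob -jobinfo` report its final status. A small sketch of pulling the status field out of that output (the sample line below is illustrative, standing in for real `dsjob -jobinfo "$PROJECT" "$JOB"` output):

```shell
# From inside an after-job routine the status would still read as running;
# queried after completion, the final status is available. The sample text
# here is a stand-in for the real command's output.
sample_info="Job Status      : RUN OK (1)"
status=$(echo "$sample_info" | sed 's/^[^:]*: *//')
echo "$status"
```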
Mamu Kim