Capturing historical Job run statistics

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

sumanth.punyamurthula@etr
Participant
Posts: 6
Joined: Tue Oct 20, 2009 10:18 am

Capturing historical Job run statistics

Post by sumanth.punyamurthula@etr »

We have more than 500 DataStage ETL jobs which run at night to load data from various sources.

I am interested in a daily exception report which compares the number of rows loaded by a job to

1) say, the previous run, or

2) the average of the previous 3 runs

and the report should only list those jobs which have more than, say, a 20% variance in the number of rows loaded compared to the previous average.


3) I would also like a threshold so that the 20% variance is only reported for jobs which load at least 100 rows, or

4) Highlight those jobs which loaded 0 rows


Is any of this possible?
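
As a rough illustration of the comparison logic described above, here is a minimal Python sketch. It assumes the per-run row counts have already been captured somewhere (the replies below cover how to capture them); the history structure, job names and thresholds are illustrative only, not part of any DataStage API.

def exceptions(history, variance_pct=20, min_rows=100, lookback=3):
    """Flag jobs whose latest load deviates from their recent average."""
    report = []
    for job, runs in history.items():
        if not runs:
            continue
        latest = runs[-1]
        if latest == 0:                       # requirement 4: zero-row loads
            report.append((job, latest, None, "loaded 0 rows"))
            continue
        previous = runs[-(lookback + 1):-1]   # up to the 3 runs before this one
        if not previous:
            continue
        avg = sum(previous) / len(previous)
        if avg < min_rows:                    # requirement 3: ignore small jobs
            continue
        variance = abs(latest - avg) / avg * 100
        if variance > variance_pct:           # requirements 1/2: > 20% off the average
            report.append((job, latest, round(avg, 1), f"{variance:.0f}% variance"))
    return report

# Example run with made-up counts, newest last:
sample = {
    "LoadCustomers": [12000, 11800, 12100, 7000],   # big drop -> reported
    "LoadLookup":    [40, 45, 38, 60],              # average under 100 rows -> skipped
    "LoadOrders":    [5000, 5100, 4900, 0],         # zero rows -> highlighted
}
for line in exceptions(sample):
    print(line)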
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sure. Once you start capturing those statistics you can do pretty much anything that you want with them. I'm guessing the next question will be along the lines of how are they captured?

You have two choices: from the command line or via job control. The choice is typically driven by what you are most comfortable building and maintaining, scripts or DataStage jobs/routines. For the former you would leverage the dsjob command; its options are mirrored by the BASIC API functions, which would be the latter approach:

dsjob -ljobs
dsjob -logsum
dsjob -logdetail
dsjob -lognewest

DSGetProjectInfo
DSGetLogSum
DSGetLogEntry
DSGetNewestLogId

There should be a Command Line Interface chapter in the Server Job Developer's Guide PDF that explains all of the dsjob options, and the functions are covered in the online help.
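
For the command-line route, a small wrapper script can loop over a project's jobs and pull their log summaries. Here is a rough Python sketch, assuming dsjob is on the PATH and the DataStage environment has been sourced; the project name is a placeholder.

import subprocess

PROJECT = "MY_PROJECT"   # placeholder

def dsjob(args):
    """Run a dsjob command and return its standard output."""
    return subprocess.run(["dsjob"] + args, capture_output=True,
                          text=True, check=True).stdout

def list_jobs(project):
    """dsjob -ljobs prints one job name per line."""
    return [j.strip() for j in dsjob(["-ljobs", project]).splitlines() if j.strip()]

def log_summary(project, job):
    """Summary log entries for a job (switch to -logdetail for the full text)."""
    return dsjob(["-logsum", project, job])

for job in list_jobs(PROJECT):
    print("===", job, "===")
    print(log_summary(PROJECT, job))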

Some people hook a process like this to every job; others take one pass through all jobs as an 'after batch' post-process. Either can be made to work. Then log what you capture into an RDBMS and you can do whatever reporting you like after that. You might also want to check out the freely downloadable EtlStats package at Kim Duke's website, where he's already done quite a bit of the work for you. You can take it and adapt it to what you need, or simply study it and then build your own version.
Last edited by chulett on Fri Oct 30, 2009 8:15 pm, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

This question was originally an email to me. I asked that it be posted here so others can benefit from the answers. Here is my original reply.

I worked at Hotels.com. We had similar reports. You need to run EtlStats after every run to get your row counts. We ran an exception report based on standard deviation. This tended not to be accurate; there were too many exceptions. We changed it to be based on the day of the week, so we used all the row counts for all jobs run on a Monday, for example. A web-based service tends to have patterns based on the day of the week; weekends may see more or less business. Some jobs may fail if there are no rows, so you need to watch for that as well.
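
As a small sketch of that day-of-week idea (the history structure and names are illustrative, not part of EtlStats), the baseline for a job can be computed from its previous runs on the same weekday:

from datetime import date
from statistics import mean

def weekday_baseline(history, job, today=None):
    """history: {job: [(run_date, rows), ...]}; average rows for today's weekday."""
    today = today or date.today()
    same_day = [rows for run_date, rows in history.get(job, [])
                if run_date.weekday() == today.weekday()]
    return mean(same_day) if same_day else None

The same 20% variance check from earlier can then be applied against this weekday baseline instead of the overall average.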
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

This question has been asked several times on this forum. So a search may help get a more complete answer.

EtlStats is a set of server jobs which extract row counts from the

dsjob -report XML

command. It writes these results to tables. Create-table DDL is included for four or more databases, including Oracle, SQL Server and DB2. The primary keys keep these tables from having duplicates, so you can run these jobs more than once. The primary job gets row counts for one job and one instance at a time, and it works whether or not you use instances. There is a batch job which will feed all jobs into this one job. It can also feed only the jobs in a sequence or a folder, so it is very flexible.

You can use these tables with the row counts to run these reports. What you are doing is called a "Sanity Check" or "Completeness Report". There are many ways to audit your target tables. Do a search for completeness or sanity check and you will see more posts on this topic.
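
A minimal sketch of the "primary keys prevent duplicates" idea: however the row counts are parsed out of the dsjob -report XML, they are upserted into a history table keyed on job, link and run timestamp, so re-running the collection cannot create duplicate rows. SQLite is used here purely for illustration (EtlStats ships DDL for Oracle, SQL Server, DB2 and others), and the column names are assumptions rather than the actual EtlStats schema.

import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS ETL_ROW_HIST (
    JOB_NAME   TEXT NOT NULL,
    LINK_NAME  TEXT NOT NULL,
    RUN_TS     TEXT NOT NULL,
    ROW_COUNT  INTEGER,
    PRIMARY KEY (JOB_NAME, LINK_NAME, RUN_TS)
)
"""

def save_counts(conn, counts):
    """counts: iterable of (job_name, link_name, run_ts, row_count) tuples."""
    conn.executemany("INSERT OR REPLACE INTO ETL_ROW_HIST VALUES (?, ?, ?, ?)", counts)
    conn.commit()

conn = sqlite3.connect("etl_stats.db")
conn.execute(DDL)
save_counts(conn, [("LoadOrders", "lnk_target", "2009-10-27 02:00", 5100)])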
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

EtlStats has a different version for PX jobs. The row counts for PX jobs are at the partition level. You have two choices: you can aggregate these to the link-name level, or you can add the partition number to your key on the ETL_ROW_HIST table. The partition number is blank or null for a server job, so you need to scrub this to a zero in your transform. This is an easy fix.

We stored all the detail for PX jobs at my last job, so we could go back and check whether the partitions were balanced. If you sort your data on 8 partitions and only 2 have row counts, then you need to change your job. So you can easily do performance tuning by looking at these tables.
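
Both options above are only a few lines of code. Here is a rough Python sketch (the input shape and names are illustrative): scrub a blank/null partition number to zero, or roll the partition-level detail up to the link level.

from collections import defaultdict

def scrub_partition(partition):
    """Server jobs leave the partition number blank/null; store it as zero."""
    return 0 if partition in (None, "") else int(partition)

def rollup_to_link(rows):
    """Aggregate (job, link, partition, row_count) detail up to the link level."""
    totals = defaultdict(int)
    for job, link, _partition, count in rows:
        totals[(job, link)] += count
    return dict(totals)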
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

We have found other issues using these tables. We had duplicate rows in our target tables because our lookups were returning multiple rows. We found the offending jobs quickly by spotting that the rows coming into a lookup did not match the rows coming out of the same lookup.
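
That check is easy to automate once the link-level counts are in a table. A tiny sketch, with illustrative link names: for each lookup, compare the input link's count to the output link's count and flag anything where more rows come out than went in.

def suspect_lookups(link_counts, lookup_pairs):
    """link_counts: {link_name: rows}; lookup_pairs: [(input_link, output_link), ...]."""
    return [(i, o, link_counts.get(i, 0), link_counts.get(o, 0))
            for i, o in lookup_pairs
            if link_counts.get(o, 0) > link_counts.get(i, 0)]

print(suspect_lookups({"lnk_in": 1000, "lnk_out": 1042}, [("lnk_in", "lnk_out")]))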
Mamu Kim
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Thank you Uncle Kim. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
sumanth.punyamurthula@etr
Participant
Posts: 6
Joined: Tue Oct 20, 2009 10:18 am

Post by sumanth.punyamurthula@etr »

Here is some more info I received from our team. It looks like it is not so simple.

To generate these statistics as part of the Director job log, we need to set $APT_RECORD_COUNT to TRUE as a job parameter (or set it globally in DS Administrator). This will "echo out" the records consumed or processed by each operator/stage and print them as job log data. In the wrapper script we also need to change the dsjob option from "-logsum" (just the summary data) to "-logdetail" (the summary data along with the details). The generated log file will be detailed and comparatively larger than the existing log file.

There is one challenge: this option will generate the "records consumed" message only for parallel stages and not for non-parallel (or utility) stages. To be specific, the log does not have load information for the DB2 bulk-load utility/stage.
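
For what it is worth, here is a rough Python sketch of pulling those per-operator messages back out of the log via dsjob. The -logsum and -logdetail options are the real CLI options discussed above, but the exact wording and layout of the log entries vary by version, so both the field parsing and the search pattern below are assumptions to adjust against your own logs.

import re
import subprocess

PROJECT, JOB = "MY_PROJECT", "MY_JOB"   # placeholders

def dsjob(args):
    return subprocess.run(["dsjob"] + args, capture_output=True,
                          text=True, check=True).stdout

def record_count_entries(project, job):
    """Scan the log summary and return the detail of entries that look like
    operator record counts (the pattern is an assumption, not a fixed format)."""
    hits = []
    for line in dsjob(["-logsum", project, job]).splitlines():
        event_id = line.split("\t", 1)[0].strip()
        if not event_id.isdigit():
            continue
        detail = dsjob(["-logdetail", project, job, event_id])
        if re.search(r"records? (consumed|processed)", detail, re.IGNORECASE):
            hits.append(detail)
    return hits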
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No one ever said it would be simple. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Unless your jobs have no warnings, the log file can be huge. I would not use logdetail; we had lots of issues with this. If you are using a scheduler like Control-M, you can also end up reporting too many warnings back to these products.
Last edited by kduke on Tue Oct 27, 2009 12:05 pm, edited 1 time in total.
Mamu Kim
sumanth.punyamurthula@etr
Participant
Posts: 6
Joined: Tue Oct 20, 2009 10:18 am

Post by sumanth.punyamurthula@etr »

Sorry, this is my first post on this forum. I cannot read premium content, so I cannot read your reply.
sumanth.punyamurthula@etr
Participant
Posts: 6
Joined: Tue Oct 20, 2009 10:18 am

Post by sumanth.punyamurthula@etr »

I am very new to this forum. I cannot read premium content, and hence cannot read your reply to my post.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

premium_membership.php

My first reply really should have been marked as Premium content as well.
-craig

"You can never have too many knives" -- Logan Nine Fingers
GJRey
Premium Member
Premium Member
Posts: 4
Joined: Thu Sep 18, 2008 11:27 am

Post by GJRey »

chulett wrote: premium_membership.php

My first reply really should have been marked as Premium content as well.
Then the reply you did want him to see was marked as Premium Content.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And your point is? The one you're referring to wasn't long enough to be adversely affected by the Premium Membership restrictions. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers