Best Auditing Approach in version 8.5 onwards

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Best Auditing Approach in version 8.5 onwards

Post by SachinCho »

Hi,
We have a requirement to capture datastage job level stats for each and every job in a application. The rough audit table structure we have is
DSJobName
Source_Rec_Count
Reject_Rec_Count - will be combining all rejects in various transformations using funnel in single output
Target_Rec_Count
Job_start_time
job_end_time
Job_status - finished, aborted, finished with warnings
Batch_id- Unique identifier for particular run

Which is the best way to get this kind of information. Two options I can see are
1. Link counts by routines and calling this routine after each and every job
2. Operations DB - which is available on latest versions of DS

As I am not much aware of operations DB, any one can throw more light on this and whether all this required information above is available in it??

Thanks
Sachin
Sachin C
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

There are many ways one can approach this. At my current site, we wrote our own logging mechanism which is triggered by the "Before-Job" subroutine call in each job in the project.
Another, simpler, way would be to upgrade to 9.x for the DataStage logging mechanism.
Alternatively, look into Kim Duke's "etlstats" using the search mechanism here.
SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Post by SachinCho »

When you say before sub routine, does it mean you are accessing link counts to get source, target and reject records. I believe in that we should have before subroutine and after sub routine for each job.

Rgds,
Sachin
Sachin C
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

You can do it in after job sub routine only. I think mentioning Before job subroutine was mistake unless before job subroutine triggers some monitoring scripts or so and then proceeds with next step. EtlStats is a good approach and you do not need to reinvent the wheels. And I think 9.1 will make this much easier.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Since the after-job routine might not get called (imagine if the machine crashes during execution or the job is killed), the before-job routine call is the only reliable means of putting in a user-written auditing or logging system.
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

ArndW wrote:only reliable means

True,unless you are doing a multi step logging.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Use completeness reports to audit. Do a search. You compare counts in source tables against counts in target tables. If sources or targets are not tables then use wc -l for flat files otherwise get creative.
Mamu Kim
Klaus Schaefer
Participant
Posts: 94
Joined: Wed May 08, 2002 8:44 am
Location: Germany
Contact:

Re: Best Auditing Approach in version 8.5 onwards

Post by Klaus Schaefer »

Hi Sachin,

we do this running DSJobReport (report type 2/XML) for every job as an after job subroutine. It delivers everything you mentioned exept a Batch_id, which is your own thing. Later we process these XMLs to be stored into a DB table for auditing.

While this works pretty fine you have to be aware that there are some differences in the logging of EE and server jobs. DS_JOBS in the Universe repository lets you trace the type of job.

Another thing you have to ba aware of is that inserted/rejected rows to databases are not reported properly, depending on the settings of your data access stage. DS mainly counts the link flow records. It will report the full amound of records send to a DB link - even if these rows get rejected...

I haven't seen the new features of version 9 regarding reporting so far. Will be interesting to see what IBM has implemented meanwhile... ;-)

Best regards
Klaus
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

You have to have a reject link in a PX job to capture rejects. That is why this kind of auditing is worthless.
Mamu Kim
SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Post by SachinCho »

Thanks for the inputs !! As we are using v9.1 for our project operations DB will make life much easier.
Sachin C
Post Reply