Page 1 of 1

Best Auditing Approach in version 8.5 onwards

Posted: Sun Oct 20, 2013 11:57 pm
by SachinCho
Hi,
We have a requirement to capture datastage job level stats for each and every job in a application. The rough audit table structure we have is
DSJobName
Source_Rec_Count
Reject_Rec_Count - will be combining all rejects in various transformations using funnel in single output
Target_Rec_Count
Job_start_time
job_end_time
Job_status - finished, aborted, finished with warnings
Batch_id- Unique identifier for particular run

Which is the best way to get this kind of information. Two options I can see are
1. Link counts by routines and calling this routine after each and every job
2. Operations DB - which is available on latest versions of DS

As I am not much aware of operations DB, any one can throw more light on this and whether all this required information above is available in it??

Thanks
Sachin

Posted: Mon Oct 21, 2013 2:12 am
by ArndW
There are many ways one can approach this. At my current site, we wrote our own logging mechanism which is triggered by the "Before-Job" subroutine call in each job in the project.
Another, simpler, way would be to upgrade to 9.x for the DataStage logging mechanism.
Alternatively, look into Kim Duke's "etlstats" using the search mechanism here.

Posted: Tue Oct 22, 2013 5:40 am
by SachinCho
When you say before sub routine, does it mean you are accessing link counts to get source, target and reject records. I believe in that we should have before subroutine and after sub routine for each job.

Rgds,
Sachin

Posted: Tue Oct 22, 2013 5:58 am
by priyadarshikunal
You can do it in after job sub routine only. I think mentioning Before job subroutine was mistake unless before job subroutine triggers some monitoring scripts or so and then proceeds with next step. EtlStats is a good approach and you do not need to reinvent the wheels. And I think 9.1 will make this much easier.

Posted: Tue Oct 22, 2013 6:06 am
by ArndW
Since the after-job routine might not get called (imagine if the machine crashes during execution or the job is killed), the before-job routine call is the only reliable means of putting in a user-written auditing or logging system.

Posted: Tue Oct 22, 2013 8:38 am
by priyadarshikunal
ArndW wrote:only reliable means

True,unless you are doing a multi step logging.

Posted: Thu Oct 24, 2013 5:14 pm
by kduke
Use completeness reports to audit. Do a search. You compare counts in source tables against counts in target tables. If sources or targets are not tables then use wc -l for flat files otherwise get creative.

Re: Best Auditing Approach in version 8.5 onwards

Posted: Fri Oct 25, 2013 12:53 am
by Klaus Schaefer
Hi Sachin,

we do this running DSJobReport (report type 2/XML) for every job as an after job subroutine. It delivers everything you mentioned exept a Batch_id, which is your own thing. Later we process these XMLs to be stored into a DB table for auditing.

While this works pretty fine you have to be aware that there are some differences in the logging of EE and server jobs. DS_JOBS in the Universe repository lets you trace the type of job.

Another thing you have to ba aware of is that inserted/rejected rows to databases are not reported properly, depending on the settings of your data access stage. DS mainly counts the link flow records. It will report the full amound of records send to a DB link - even if these rows get rejected...

I haven't seen the new features of version 9 regarding reporting so far. Will be interesting to see what IBM has implemented meanwhile... ;-)

Best regards
Klaus

Posted: Mon Nov 04, 2013 9:20 am
by kduke
You have to have a reject link in a PX job to capture rejects. That is why this kind of auditing is worthless.

Posted: Mon Dec 02, 2013 5:19 am
by SachinCho
Thanks for the inputs !! As we are using v9.1 for our project operations DB will make life much easier.