Audit Check & Balance Layer

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
synsog
Premium Member
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

Audit Check & Balance Layer

Post by synsog »

Hi,

I understand as part of the ETL Architecture design we can have an Audit Check & Balance layer. The idea being to be able to reconcile between source record counts against the rejects + loaded records. Gathering Statistics at a link level can also be done via a script that pulls this data from DS.

Is there anything else we do as part of this layer and is something we always need to build as a best practice?

Thanks,
NV
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

It sounds like a nice concept. How would you plan to handle the "tricky parts" where rows counts will not cleanly add up, such as the various join and merge methods, lookups that can return multiple records per incoming row, lookups that drop non-matched records instead of rejecting them, remove duplicates stage, etc.?
Choose a job you love, and you will never have to work a day in your life. - Confucius
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Look up completeness. I don't recommend using link counts. I wrote a canned solution to audit ETL and it is included with EtlStats.
Mamu Kim
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

Kim is correct link counts are a bad practice for audit/balancing. The link counts are what traveled down the link, not what was succesfully applied.

As far as other things, it depends on to what level you want to take these checks. I have done comparisons of all the data just loaded to all of the data in the source system - and then used a threshold to determine to what degree the data was good or bad. If it was less than the value of my job parameter (.99999) then the sequence would fail and alert data stewards to an issue. If you have scd's are you properly end dating previous rows in the table - make sure you check that as well. Anything you touch should be verified.
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Hi Kim,
I wrote a canned solution to audit ETL and it is included with EtlStats.
http://www.duke-consulting.com/DataStage_Tips.htm

can you tell me will it work on 8.1 version too?
Karthik
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

It was written years ago but yes it should work fine. It is all server jobs because it was written before PX became popular. I hope someone other than me and Ron use it. It think it is cool. It requires EtlStats jobs to be setup just for the couple reports. There are examples of the reports. There is a video on how to setup EtlStats. I can easily help setup completeness. I also have blog somewhere on it.
Mamu Kim
Post Reply