Page 1 of 1

Audit Check & Balance Layer

Posted: Fri Mar 23, 2012 8:17 am
by synsog
Hi,

I understand as part of the ETL Architecture design we can have an Audit Check & Balance layer. The idea being to be able to reconcile between source record counts against the rejects + loaded records. Gathering Statistics at a link level can also be done via a script that pulls this data from DS.

Is there anything else we do as part of this layer and is something we always need to build as a best practice?

Thanks,
NV

Posted: Fri Mar 23, 2012 10:01 am
by qt_ky
It sounds like a nice concept. How would you plan to handle the "tricky parts" where rows counts will not cleanly add up, such as the various join and merge methods, lookups that can return multiple records per incoming row, lookups that drop non-matched records instead of rejecting them, remove duplicates stage, etc.?

Posted: Thu Mar 29, 2012 6:54 am
by kduke
Look up completeness. I don't recommend using link counts. I wrote a canned solution to audit ETL and it is included with EtlStats.

Posted: Thu Mar 29, 2012 10:46 am
by kwwilliams
Kim is correct link counts are a bad practice for audit/balancing. The link counts are what traveled down the link, not what was succesfully applied.

As far as other things, it depends on to what level you want to take these checks. I have done comparisons of all the data just loaded to all of the data in the source system - and then used a threshold to determine to what degree the data was good or bad. If it was less than the value of my job parameter (.99999) then the sequence would fail and alert data stewards to an issue. If you have scd's are you properly end dating previous rows in the table - make sure you check that as well. Anything you touch should be verified.

Posted: Tue Apr 03, 2012 11:05 pm
by karthi_gana
Hi Kim,
I wrote a canned solution to audit ETL and it is included with EtlStats.
http://www.duke-consulting.com/DataStage_Tips.htm

can you tell me will it work on 8.1 version too?

Posted: Thu Apr 05, 2012 3:11 pm
by kduke
It was written years ago but yes it should work fine. It is all server jobs because it was written before PX became popular. I hope someone other than me and Ron use it. It think it is cool. It requires EtlStats jobs to be setup just for the couple reports. There are examples of the reports. There is a video on how to setup EtlStats. I can easily help setup completeness. I also have blog somewhere on it.