Hi,
I understand as part of the ETL Architecture design we can have an Audit Check & Balance layer. The idea being to be able to reconcile between source record counts against the rejects + loaded records. Gathering Statistics at a link level can also be done via a script that pulls this data from DS.
Is there anything else we do as part of this layer and is something we always need to build as a best practice?
Thanks,
NV
Audit Check & Balance Layer
Moderators: chulett, rschirm, roy
It sounds like a nice concept. How would you plan to handle the "tricky parts" where rows counts will not cleanly add up, such as the various join and merge methods, lookups that can return multiple records per incoming row, lookups that drop non-matched records instead of rejecting them, remove duplicates stage, etc.?
Choose a job you love, and you will never have to work a day in your life. - Confucius
-
- Participant
- Posts: 437
- Joined: Fri Oct 21, 2005 10:00 pm
Kim is correct link counts are a bad practice for audit/balancing. The link counts are what traveled down the link, not what was succesfully applied.
As far as other things, it depends on to what level you want to take these checks. I have done comparisons of all the data just loaded to all of the data in the source system - and then used a threshold to determine to what degree the data was good or bad. If it was less than the value of my job parameter (.99999) then the sequence would fail and alert data stewards to an issue. If you have scd's are you properly end dating previous rows in the table - make sure you check that as well. Anything you touch should be verified.
As far as other things, it depends on to what level you want to take these checks. I have done comparisons of all the data just loaded to all of the data in the source system - and then used a threshold to determine to what degree the data was good or bad. If it was less than the value of my job parameter (.99999) then the sequence would fail and alert data stewards to an issue. If you have scd's are you properly end dating previous rows in the table - make sure you check that as well. Anything you touch should be verified.
Keith Williams
keith@peacefieldinc.com
keith@peacefieldinc.com
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
Hi Kim,
can you tell me will it work on 8.1 version too?
http://www.duke-consulting.com/DataStage_Tips.htmI wrote a canned solution to audit ETL and it is included with EtlStats.
can you tell me will it work on 8.1 version too?
Karthik
It was written years ago but yes it should work fine. It is all server jobs because it was written before PX became popular. I hope someone other than me and Ron use it. It think it is cool. It requires EtlStats jobs to be setup just for the couple reports. There are examples of the reports. There is a video on how to setup EtlStats. I can easily help setup completeness. I also have blog somewhere on it.
Mamu Kim