Page 1 of 1

Audit/Control Totals

Posted: Fri Sep 16, 2011 12:31 pm
by bobyon
I am working on a project to replace functionality that is currently in what has been called a job Wrapper. It is in a job named NewSequence. The actual code is DSBasic in the job control section of the job. I mention that because I am certain some folks will recognize it.

I am still working my way through the code attempting to understand all of it's functionality, but, one of the functions is to collect and store some control totals (rowcounts and such).

My questions is: What is considered current best practice for gathering control totals; row counts and/or summary totals.
How do you do it?

Thanks,
Bob

Posted: Sat Sep 17, 2011 8:49 am
by kduke
Run time stats are useful in a few ways. Trending is one way. Are my jobs running slower? How do you answer this question otherwise? Stats can show you trends. Rows/second can also compare 2 different jobs. Need to know stage types for this one. So if you are comparing DB2 rows/sec to Oracle then you need very simple jobs like flat file to insert. If you are comparing DB2 to DB2 then it is a little easier. Remember rows/sec slow down to the slowest link. So if you have aggregate stages or other more time consuming stages then DB2 may not be your slowest link.

If you store your stats at the partition level then you see if your partitions are balanced, meaning do they all have the same number of nodes.

ETL stats can easily report the job run order. If you have complex sequences then you can see what jobs run in what order. If you associate files or tables with your jobs then you can see if a dependent file gets loaded before it is consumed. I found a few mistakes this way. So data was being loaded with yesterdays data.

ETL stats are useful only if you decide to how you want to use them. Then look at them on a regular basis. Are my jobs running slower? If so then why?