Audit/Control Totals

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
bobyon
Premium Member
Premium Member
Posts: 200
Joined: Tue Mar 02, 2004 10:25 am
Location: Salisbury, NC

Audit/Control Totals

Post by bobyon »

I am working on a project to replace functionality that is currently in what has been called a job Wrapper. It is in a job named NewSequence. The actual code is DSBasic in the job control section of the job. I mention that because I am certain some folks will recognize it.

I am still working my way through the code attempting to understand all of it's functionality, but, one of the functions is to collect and store some control totals (rowcounts and such).

My questions is: What is considered current best practice for gathering control totals; row counts and/or summary totals.
How do you do it?

Thanks,
Bob
Bob
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Run time stats are useful in a few ways. Trending is one way. Are my jobs running slower? How do you answer this question otherwise? Stats can show you trends. Rows/second can also compare 2 different jobs. Need to know stage types for this one. So if you are comparing DB2 rows/sec to Oracle then you need very simple jobs like flat file to insert. If you are comparing DB2 to DB2 then it is a little easier. Remember rows/sec slow down to the slowest link. So if you have aggregate stages or other more time consuming stages then DB2 may not be your slowest link.

If you store your stats at the partition level then you see if your partitions are balanced, meaning do they all have the same number of nodes.

ETL stats can easily report the job run order. If you have complex sequences then you can see what jobs run in what order. If you associate files or tables with your jobs then you can see if a dependent file gets loaded before it is consumed. I found a few mistakes this way. So data was being loaded with yesterdays data.

ETL stats are useful only if you decide to how you want to use them. Then look at them on a regular basis. Are my jobs running slower? If so then why?
Mamu Kim
Post Reply