Error Processing

pradkumar · Post by **pradkumar** » Tue Jan 02, 2007 7:11 pm

Hi

What is an Error Processing table and Audit table.
Plz let me know the concept behind these tables.
How these tables could be built?

ray.wurlod · Post by **ray.wurlod** » Tue Jan 02, 2007 7:55 pm

Something a site chooses to implement.

Nothing to do with DataStage, though that may be the tool used to populate these tables.

My guesses would be as follows.

Error processing table records any errors generated within ETL processing.

Audit table is used to store row count reconciliations.

kumar_s · Post by **kumar_s** » Wed Jan 03, 2007 4:09 am

Hi pradkumar,
You can let us know, what is this regards to.
It can be either the either the normal naming conventions used in your site or project for the details given by Ray or it can be referred to the database internal tables.

pradkumar · Post by **pradkumar** » Wed Jan 03, 2007 4:26 pm

This is for keeping track of errors. This is a good practice for reconciliation and also to know what errors are present and how to reprocess them.
I did not build one until one.
I need to get more clear picture of how to build it and populate it

kumar_s · Post by **kumar_s** » Wed Jan 03, 2007 4:30 pm

As mentioned earlier, its one of the process you like to follow in you development. Error processing tables will mostly be the one which collect all the rejects with the error code. There are several approaches, based on you methodology. Its more flexible if you stored procedure.
Audit table will be the one, which will have the total row count, start time, end time of all the tables while data population (Including Error processing Tables).

ray.wurlod · Post by **ray.wurlod** » Wed Jan 03, 2007 5:10 pm

You build these tables according to the specifications of your customer. What information do they want recorded? Once you have those answers, you can figure out how to capture these from DataStage and elsewhere, and create tasks for recording them into the tables.

pradkumar · Post by **pradkumar** » Thu Jan 04, 2007 2:03 am

Right now I am trying to build an audit table which consists of the following information:

1) Batch number of the daily run
2) Extraction start time
3) Extraction end time
4) Time elapsed

SO where to insert this link in the job and how to populate these columns. This table will be an ouput link from Transformer (I guess)

kumar_s · Post by **kumar_s** » Thu Jan 04, 2007 2:08 am

Not necessarily, it can be even on Before/After SQLs. In the before SQLs of the input stage, you can insert the record with the start time as SYSDATE and current batch id as Jobparameter which is passed from JobSequence or the Main job control. And in the After job control populate you can populate the SYSDATE for End time and StartDate - EndDate as ElaspsedTime.

ray.wurlod · Post by **ray.wurlod** » Thu Jan 04, 2007 4:49 am

You never want to do it IN the job.

Do it afterwards, so that you don't slow the job itself.

pradkumar · Post by **pradkumar** » Thu Jan 04, 2007 11:46 am

Ray,

In ur reply "Do it afterwards, so that you don't slow the job itself. "

Afterwards means in the target stage I should start populating my audit table. Or is something else you were mentioning about

Pavan_Yelugula · Post by **Pavan_Yelugula** » Thu Jan 04, 2007 12:09 pm

Hi
Ray is just suggesting you to capture the audit information outside the actual Job(Main Logic whatever you are doing) coz if you include the audit capturing code in the actual job it will slow your actual job

. hope you got it...

Thanks
Pavan

ray.wurlod · Post by **ray.wurlod** » Thu Jan 04, 2007 3:02 pm

I'd always advocate a separate job. Of course, you may need to detect and trap error information in the job itself, for such is the nature of these things, but simply redirect the error information into a file. The row counts and started/stopped times can be obtained subsequently. Use the times from active stages rather than from jobs (or both) if you're interested in discounting startup and shutdown overheads.