I have an ODBC Stage that SELECTs rows from a System i table/physical file. This feeds a Transformer stage that puts the rows on respective INSERT/UPDATE/DELETE links to a second ODBC Stage writing to another database. Once all of the records have been processed in the second ODBC Stage, I would like to delete ("prune") all of the records I have read from the original System i table/physical file.
The Transformer Stage (in the middle) has a value used to identify the records to be deleted. I need the Transformer to feed that value to a third ODBC Stage to process the deletion, but only after all of the records have been processed in the target database (the second ODBC Stage). I know how to pass the value from the Transformer. How can I tell when the second ODBC Stage has finished successfully processing all of its input records?
Structuring a Server Job to Delete/Prune Input Records
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 15
- Joined: Tue Jul 27, 2010 1:33 pm
- Contact:
Re: Structuring a Server Job to Delete/Prune Input Records
If your source is a table, you will have to split your process into two jobs to delete the data from the source. If your source is a physical file then you can write your flagged records from the transformer stage into a sequential file and overwrite your source file as part of your after job subroutine.
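The two-job split described above can be sketched outside of DataStage. This is a hedged illustration only: the table name, columns, and in-memory SQLite connections are all hypothetical stand-ins for the two ODBC connections, and the point is the ordering — the prune runs only after the load has committed, and deletes exactly the keys that were read:

```python
import sqlite3

# In-memory stand-ins for the source and target databases (hypothetical
# schema; in DataStage these would be the two ODBC connections).
src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
tgt.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 9.5), (2, 3.0), (3, 7.25)])
src.commit()

# "Job 1": extract from source, load target, and remember which keys were read.
rows = src.execute("SELECT id, amount FROM orders").fetchall()
tgt.executemany("INSERT INTO orders VALUES (?, ?)", rows)
tgt.commit()
loaded_ids = [r[0] for r in rows]

# "Job 2": runs only after job 1 has committed successfully.
# Prune exactly the rows that were loaded -- never an unqualified DELETE,
# in case new rows arrived in the source while job 1 was running.
src.executemany("DELETE FROM orders WHERE id = ?",
                [(i,) for i in loaded_ids])
src.commit()

print(src.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 0
```

In a job sequence the same ordering is enforced by triggering the prune job from the successful completion of the load job.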
Kris
Where's the "Any" key?-Homer Simpson
You can have your Transformer stage output records to a flat file, then have an output link on that file which won't get executed until the last row has been written to it.
-
- Participant
- Posts: 246
- Joined: Mon Jun 30, 2008 3:22 am
- Location: New York
- Contact:
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
First and foremost, is the source table used by any other system? If so, modifying its contents will impact them. Have you discussed your method with the owners of those systems?
Second, is the source updated 24x7? If so, what about records that may be modified while your system is selecting from it? You may lose them.
The best method would be to include a 'modified timestamp' in the source table, populated either by the systems creating the source or by a trigger, and to use this value in your extraction. That way you do not hinder the source users and you ensure that all records are accessed.
This will also help with future audits. I am sure you will face this.
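The watermark idea above can be sketched as follows. This is a minimal illustration, not the poster's actual schema: the table, the `modified_ts` column, and the SQLite connection are hypothetical. The key point is that the extract and the prune share the same timestamp predicate, so a row modified after the watermark survives until the next run:

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source (id INTEGER PRIMARY KEY, payload TEXT, modified_ts TEXT)"
)
base = datetime(2010, 8, 1, 12, 0, 0)
conn.executemany(
    "INSERT INTO source VALUES (?, ?, ?)",
    [(1, "a", (base - timedelta(minutes=5)).isoformat()),
     (2, "b", (base - timedelta(minutes=1)).isoformat()),
     (3, "c", (base + timedelta(minutes=2)).isoformat())],  # modified mid-run
)

# Capture the watermark once, at the start of the extract.
watermark = base.isoformat()

# Extract only rows at or before the watermark ...
extracted = conn.execute(
    "SELECT id FROM source WHERE modified_ts <= ?", (watermark,)
).fetchall()

# ... and prune with the SAME predicate, so the row modified after the
# watermark (id 3) is left for the next run instead of being lost.
conn.execute("DELETE FROM source WHERE modified_ts <= ?", (watermark,))
remaining = conn.execute("SELECT id FROM source").fetchall()
print([r[0] for r in extracted], [r[0] for r in remaining])  # [1, 2] [3]
```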
-
- Participant
- Posts: 15
- Joined: Tue Jul 27, 2010 1:33 pm
- Contact:
Solution Found!
Thank you to all who responded to this post. I have blogged on the integration of InfoSphere Change Data Capture and DataStage - including the solution to the pruning question I had posted here.
http://www.thefillmoregroup.com/blog/?p=369