
Structuring a Server Job to Delete/Prune Input Records

Posted: Wed Jul 28, 2010 10:09 am
by ffillmorejr
I have an ODBC Stage that SELECTs rows from a System i table/physical file. This feeds a Transformer stage that puts the rows on respective INSERT/UPDATE/DELETE links to a second ODBC Stage writing to another database. Once all of the records have been processed in the second ODBC Stage, I would like to delete ("prune") all of the records I have read from the original System i table/physical file.

The Transformer Stage (in the middle) has a value used to identify the records to be deleted. I need the Transformer to feed that value to a third ODBC Stage to process the deletion, but only after all of the records have been processed in the target database (second ODBC Stage). I know how to pass the value from the Transformer. How do I know when the target database records have all been processed? How can I tell when the second ODBC Stage has finished successfully processing all of its input records?
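
In case it helps, this is roughly the user-defined SQL I picture the third ODBC Stage running, with illustrative table and column names (the ? marker would be bound to the identifying value the Transformer passes on the delete link):

    -- Illustrative only: prune source rows matching the batch
    -- identifier supplied by the Transformer on the delete link.
    DELETE FROM MYLIB.SRCPF
    WHERE  BATCH_ID = ?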

Posted: Wed Jul 28, 2010 10:28 am
by anbu
I don't think you can read from and delete from the same table in one job. Our forum experts can confirm this.

Instead, you can write the key fields to a sequential file or hashed file and create another job that reads them and deletes the matching rows from your source table.
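
For example, the second job could read the saved keys on an input link and drive a user-defined DELETE in an ODBC stage, something along these lines (table and column names are placeholders; each ? is bound in order to a key column on the link):

    -- Second job: delete the source rows matching the keys
    -- captured by the first job.
    DELETE FROM MYLIB.SRCPF
    WHERE  KEYCOL1 = ?
    AND    KEYCOL2 = ?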

Re: Structuring a Server Job to Delete/Prune Input Records

Posted: Wed Jul 28, 2010 10:41 am
by kris007
If your source is a table, you will have to split your process into two jobs to delete the data from the source. If your source is a physical file, you can write the flagged records from the Transformer stage to a sequential file and overwrite your source file in an after-job subroutine.

Posted: Wed Jul 28, 2010 10:49 am
by ArndW
You can have your Transformer stage output records to a flat file, then put an output link on that file; the link won't be executed until the last row has been written to the file.

Posted: Wed Jul 28, 2010 11:16 am
by kris007
But the OP wants all of the records to be inserted/updated in the target table before the process starts deleting records from the source table.

Posted: Wed Jul 28, 2010 11:32 am
by arunkumarmm
The best way is to do the delete in a separate job.

Posted: Wed Jul 28, 2010 2:48 pm
by Sainath.Srinivasan
First and foremost, is the source table used by any other system? If so, modifying its content will impact them. Have you discussed your method with the owners of those systems?

Second, is the source updated 24x7? If so, what about records that are modified after your system selects from it? You may lose them.

The best method is to include a 'modified timestamp' column in the source table, populated either by the systems that create the source rows or by a trigger, and to use this value in your extraction. This way, you do not hinder the source users and you ensure that all records are captured.

This will also help with future audits. I am sure you will face them.
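
As a sketch, assuming hypothetical job parameters LastCheckpoint and RunTimestamp and a MOD_TS column, the extraction could be bounded like this, so that any later prune only touches rows inside the same window:

    -- Illustrative extraction window; #...# are DataStage job parameters.
    SELECT *
    FROM   MYLIB.SRCPF
    WHERE  MOD_TS >  '#LastCheckpoint#'
    AND    MOD_TS <= '#RunTimestamp#'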

Solution Found!

Posted: Tue Aug 10, 2010 7:31 am
by ffillmorejr
Thank you to all who responded to this post. I have blogged on the integration of InfoSphere Change Data Capture and DataStage - including the solution to the pruning question I had posted here.

http://www.thefillmoregroup.com/blog/?p=369