how data loads in db2 stage after restarting

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ravi7487
Premium Member
Posts: 25
Joined: Sat Feb 05, 2011 10:03 pm


Post by ravi7487 »

Hi,

I have a job which writes 1000 records to a DB2 database. My question is: if the job aborts due to connectivity issues after writing 500 records, how can I make sure that after restarting it will write from record 501?

Dataset ------>DB2

Please explain this to me. What options do I have to use in the DB2 stage?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

This is very difficult to do in a parallel job due to partitioning and multiple transformers and DB2 insert stages within the job. It is difficult to work out how many rows were loaded to target and how to skip those rows on a restart.

The only fully automated way to do this is to use the new CDD Transaction stage for DataStage. It requires the purchase of a "CDD" license, which is the InfoSphere CDC tool (formerly known as DataMirror) under a special DataStage license that allows unlimited CDC sources. CDC is a database replication tool that can deliver delta data from database logs. It can guarantee delivery of data through a DataStage job by bookmarking the target table, keeping track of rows delivered to the target, and ensuring missing rows get delivered after a restart.

If you do not have a CDD/CDC license, it takes a bit of code to handle failover and restart. You can design a job that checks whether the data is already in the target table, or you can automate the removal of previous inserts and restart from the beginning.
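That delete-and-restart pattern can be sketched outside DataStage. Here is a minimal Python example using sqlite3 as a stand-in for DB2; the `run_id` column, the `load` helper, and all names here are illustrative assumptions for the pattern, not DataStage or DB2 features:

```python
import sqlite3

RUN_ID = "job_run_01"  # hypothetical identifier tagging this job run's rows

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target (id INTEGER PRIMARY KEY, payload TEXT, run_id TEXT)"
)

def load(rows, fail_after=None):
    """Insert rows tagged with RUN_ID; optionally abort mid-load."""
    for i, (rid, payload) in enumerate(rows):
        if fail_after is not None and i == fail_after:
            raise ConnectionError("simulated connectivity failure")
        conn.execute(
            "INSERT INTO target VALUES (?, ?, ?)", (rid, payload, RUN_ID)
        )
    conn.commit()

rows = [(i, "rec%d" % i) for i in range(1, 1001)]

try:
    load(rows, fail_after=500)  # job aborts after writing 500 records
except ConnectionError:
    conn.rollback()             # discard any uncommitted inserts...
    # ...and defensively remove any rows this run may already have committed
    conn.execute("DELETE FROM target WHERE run_id = ?", (RUN_ID,))
    conn.commit()

load(rows)  # restart cleanly from record 1 with no duplicates
count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)
```

Tagging each run's rows with an identifier is what makes the cleanup safe: the restart job only deletes its own partial inserts, never rows loaded by earlier successful runs.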
ravi7487
Premium Member
Posts: 25
Joined: Sat Feb 05, 2011 10:03 pm

Post by ravi7487 »

Thanks for the reply.

What is the usual process when a job aborts after writing a few records? Do we manually delete them and start the job again?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

That is the safest path and results in the simplest ETL job design.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Of course, it might be a server job, since the question was posted in the Server forum...
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

A Server job that sources from a dataset? I think not... hold on tight, off to PX land we go!
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's a challenge, but one that I think has a solution.

(But it's probably posted in the incorrect forum.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.