how data loads in db2 stage after restarting

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ravi7487
Premium Member
Posts: 25
Joined: Sat Feb 05, 2011 10:03 pm


Post by ravi7487 »

Hi,

I have a job which writes 1000 records to a DB2 database. My question is: if the job aborts due to connectivity issues after writing 500 records, how can I make sure that after restarting it will write from record 501?

Dataset ------>DB2

Please explain this to me. What options do I have to use in the DB2 stage?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

This is very difficult to do in a parallel job due to partitioning and multiple transformers and DB2 insert stages within the job. It is difficult to work out how many rows were loaded to target and how to skip those rows on a restart.

The only fully automated way to do this is to use the new CDD Transaction stage for DataStage. It requires the purchase of a "CDD" license, which is the InfoSphere CDC tool (formerly known as DataMirror) under a special DataStage license that allows unlimited CDC sources. CDC is a database replication tool that can deliver delta data from database logs. It can guarantee delivery of data through a DataStage job by bookmarking the target table, keeping track of rows delivered to the target, and ensuring missing rows get delivered after a restart.

If you do not have a CDD/CDC license, it takes a bit of code to handle failover and restart. You can design a job that checks whether the data is already in the target table, or you can automate the removal of previous inserts and restart from the beginning.
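That delete-and-restart pattern can be sketched outside DataStage. Here is a minimal Python example using sqlite3 as a stand-in for DB2; the `run_id` column, the `load` helper, and all names here are illustrative assumptions for the pattern, not DataStage or DB2 features:

```python
import sqlite3

RUN_ID = "job_run_01"  # hypothetical identifier tagging this job run's rows

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target (id INTEGER PRIMARY KEY, payload TEXT, run_id TEXT)"
)

def load(rows, fail_after=None):
    """Insert rows tagged with RUN_ID; optionally abort mid-load."""
    for i, (rid, payload) in enumerate(rows):
        if fail_after is not None and i == fail_after:
            raise ConnectionError("simulated connectivity failure")
        conn.execute(
            "INSERT INTO target VALUES (?, ?, ?)", (rid, payload, RUN_ID)
        )
    conn.commit()

rows = [(i, "rec%d" % i) for i in range(1, 1001)]

try:
    load(rows, fail_after=500)  # job aborts after writing 500 records
except ConnectionError:
    conn.rollback()             # discard any uncommitted inserts...
    # ...and defensively remove any rows this run may already have committed
    conn.execute("DELETE FROM target WHERE run_id = ?", (RUN_ID,))
    conn.commit()

load(rows)  # restart cleanly from record 1 with no duplicates
count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)
```

Tagging each run's rows with an identifier is what makes the cleanup safe: the restart job only deletes its own partial inserts, never rows loaded by earlier successful runs.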
ravi7487
Premium Member
Posts: 25
Joined: Sat Feb 05, 2011 10:03 pm

Post by ravi7487 »

Thanks for the reply.

What is the usual process when a job aborts after writing a few records? Do we manually delete them and start the job again?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

That is the safest path and results in the simplest ETL job design.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Of course, it might be a server job, since the question was posted in the Server forum...
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

A Server job that sources from a dataset? I think not... hold on tight, off to PX land we go!
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's a challenge, but one that I think has a solution.

(But it's probably posted in the incorrect forum.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.