Hi,
I have a job that writes 1000 records to a DB2 database. My question: if the job aborts due to connectivity issues after writing 500 records, how can I make sure that, after restarting, it resumes writing from record 501?
Dataset ------>DB2
Could you please explain this? Which options do I have to use in the DB2 stage?
How does data load in the DB2 stage after a restart?
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
This is very difficult to do in a parallel job due to partitioning and multiple transformers and DB2 insert stages within the job. It is difficult to work out how many rows were loaded to target and how to skip those rows on a restart.
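To see why "restart from record 501" is not well defined once the data is partitioned, here is a small Python simulation. It is only an illustration, not DataStage code: the round-robin partitioning, the four partitions, and the per-partition progress figures are all assumptions chosen for the example.

```python
def rows_loaded_at_abort(total_rows, partitions, progress):
    """Return the sorted input row numbers that reached the target before the abort.

    Rows are dealt round-robin (an assumption): row r goes to partition
    (r - 1) % partitions. progress[p] is how many rows partition p had
    written when the job died (hypothetical figures).
    """
    loaded = []
    for p in range(partitions):
        rows_in_partition = [r for r in range(1, total_rows + 1)
                             if (r - 1) % partitions == p]
        loaded.extend(rows_in_partition[:progress[p]])
    return sorted(loaded)


# 500 rows made it to the target, but the partitions ran at different speeds.
loaded = rows_loaded_at_abort(1000, 4, [200, 150, 100, 50])
missing = sorted(set(range(1, 1001)) - set(loaded))

print(len(loaded))    # 500 rows are in the target
print(missing[0])     # first input row that is NOT in the target
print(loaded[-1])     # highest input row that IS in the target
```

In this run the target holds 500 rows, yet row 204 is missing while row 797 is present, so skipping the first 500 input rows on restart would both lose and duplicate data.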
The only fully automated way to do this is to use the new CDC Transaction stage for DataStage. It requires the purchase of a "CDD" license, which is the InfoSphere CDC tool (formerly known as DataMirror) under a special DataStage license that allows unlimited CDC sources. CDC is a database replication tool that can deliver delta data from database logs. It can guarantee delivery of data through a DataStage job by bookmarking the target table, keeping track of the rows delivered to the target, and ensuring that missing rows get delivered after a restart.
If you do not have a CDD/CDC license, you need to build a bit of logic yourself to handle failover and restart. You can design a job that checks whether the data is already in the target table, or you can automate the removal of the previous inserts and restart from the beginning.
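The two manual options can be sketched roughly as follows. This is a minimal Python illustration of the logic only, not a DataStage job design: `sqlite3` stands in for DB2 so the example is self-contained, and the `target` table, its `id`/`payload`/`batch_id` columns, the batch name `run-42`, and both function names are hypothetical. It also assumes every record has a unique key and that each load can be tagged with a batch identifier.

```python
import sqlite3


def new_target():
    """Create an empty in-memory stand-in for the DB2 target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, payload TEXT, batch_id TEXT)")
    return conn


def count(conn):
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]


def load_from_start(conn, rows, batch_id, fail_after=None):
    """Option 1: remove whatever an earlier attempt of this batch inserted, then reload everything."""
    conn.execute("DELETE FROM target WHERE batch_id = ?", (batch_id,))
    conn.commit()
    for n, (key, payload) in enumerate(rows, start=1):
        if fail_after is not None and n > fail_after:
            raise ConnectionError("simulated connectivity failure")
        conn.execute("INSERT INTO target (id, payload, batch_id) VALUES (?, ?, ?)",
                     (key, payload, batch_id))
        if n % 100 == 0:          # commit every 100 rows, so an abort leaves partial data
            conn.commit()
    conn.commit()


def load_skipping_existing(conn, rows, batch_id):
    """Option 2: check which keys are already in the target and insert only the rest."""
    existing = {row[0] for row in conn.execute("SELECT id FROM target")}
    to_insert = [(k, p, batch_id) for k, p in rows if k not in existing]
    conn.executemany("INSERT INTO target (id, payload, batch_id) VALUES (?, ?, ?)", to_insert)
    conn.commit()
    return len(to_insert)


rows = [(i, f"record {i}") for i in range(1, 1001)]

# Option 1: the first attempt dies after 500 rows, the rerun cleans up and reloads all 1000.
conn1 = new_target()
try:
    load_from_start(conn1, rows, "run-42", fail_after=500)
except ConnectionError:
    pass
after_abort = count(conn1)
load_from_start(conn1, rows, "run-42")
after_restart = count(conn1)

# Option 2: same failure, but the rerun only inserts the keys that are missing.
conn2 = new_target()
try:
    load_from_start(conn2, rows, "run-42", fail_after=500)
except ConnectionError:
    pass
inserted_on_rerun = load_skipping_existing(conn2, rows, "run-42")

print(after_abort, after_restart, inserted_on_rerun, count(conn2))
```

Neither variant depends on knowing which input rows were written before the abort, which is why both still work when the load is partitioned.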
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
That is the safest path and results in the simplest ETL job design.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn