Hi Gurus,
In incremental loading, we load only the records between the last successful load and the current run. Actually, I have never done incremental loading. How do we know when the last successful load was?
Can the gurus please help me analyze this?
Cheers,
Vijay
incremental loading
Moderators: chulett, rschirm, roy
Create a table and manage "Process Metadata" using it. Store information about every run, use that metadata to feed subsequent processes. That metadata could be configuration information, such as a "From Date" column.
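A minimal sketch of such a "Process Metadata" control table, using SQLite as a stand-in database for illustration; the table and column names here are assumptions, not from any specific DataStage project.

```python
# Sketch of a process-metadata control table: one row per run, queried
# by the next run to find the last successful load. Names are assumed.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE process_metadata (
        job_name  TEXT,
        run_start TEXT,
        run_end   TEXT,
        status    TEXT,   -- 'SUCCESS' or 'FAILED'
        from_date TEXT    -- "From Date" window start fed to the next run
    )
""")

# Record two runs: a failed one and a successful one.
conn.execute("INSERT INTO process_metadata VALUES "
             "('load_orders', '2005-09-01 02:00', '2005-09-01 02:05', 'FAILED',  '2005-08-31 00:00')")
conn.execute("INSERT INTO process_metadata VALUES "
             "('load_orders', '2005-09-02 02:00', '2005-09-02 02:15', 'SUCCESS', '2005-09-01 00:00')")

# The next run asks: when did the last successful load finish?
row = conn.execute("""
    SELECT MAX(run_end) FROM process_metadata
    WHERE job_name = 'load_orders' AND status = 'SUCCESS'
""").fetchone()
print(row[0])   # -> 2005-09-02 02:15
```

That timestamp (or the stored from_date) then becomes the lower bound of the next extraction window.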
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Usually two timestamps are stored in a control table. These timestamps are passed as job parameters to the SQL select (select * from table where date between #date1# and #date2#). Once your cycle finishes, or before your next run, the control table is updated/inserted with new date values and read again to supply the parameters. This will work if the records you are selecting have a date/timestamp column.
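A runnable sketch of that two-timestamp pattern, with SQLite standing in for the real database; table and column names are illustrative assumptions.

```python
# Sketch of the control-table windowing pattern: read #date1#/#date2#
# from a control table, extract the window, then slide the window
# forward after a successful cycle. Names are assumed.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE control (date1 TEXT, date2 TEXT);
    INSERT INTO control VALUES ('2005-09-01', '2005-09-02');

    CREATE TABLE orders (id INTEGER, updated TEXT);
    INSERT INTO orders VALUES (1, '2005-08-30'), (2, '2005-09-01'), (3, '2005-09-02');
""")

# Read the window and use it as the job parameters.
date1, date2 = conn.execute("SELECT date1, date2 FROM control").fetchone()
rows = conn.execute(
    "SELECT id FROM orders WHERE updated BETWEEN ? AND ?", (date1, date2)
).fetchall()
print([r[0] for r in rows])   # -> [2, 3]

# After a successful cycle, slide the window forward for the next run.
conn.execute("UPDATE control SET date1 = date2, date2 = '2005-09-03'")
```

The update at the end is what makes the next run "incremental": its lower bound is this run's upper bound.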
The other way is to do a lookup against the target table; records not found there go in as incremental inserts.
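The lookup approach can be sketched in a few lines; the keys and row shapes below are made up for illustration.

```python
# Sketch of lookup-based incremental loading: keys already present in
# the target are skipped, everything else becomes an insert.
target_keys = {101, 102, 103}                      # keys already in the target table
incoming = [(102, 'existing row'), (104, 'new row'), (105, 'new row')]

new_rows = [row for row in incoming if row[0] not in target_keys]
print(new_rows)   # -> [(104, 'new row'), (105, 'new row')]
```

Note this variant only finds new records; detecting changed records needs a column comparison or checksum on top of the key lookup.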
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
- Premium Member
- Posts: 252
- Joined: Mon Sep 19, 2005 10:28 pm
- Location: Melbourne, Australia
IMO this is better done at the extraction phase, not at load time.
For delta-extraction, you can use one of the methods described above, or if extracting from an Oracle database you can look into Change Data Capture - it's a non-invasive way to capture all the changes on a table without timestamp columns, full table scans, or - with 10g - even accessing the source table itself (deltas are scraped straight from Archive Logs).
Ross Leishman
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
I did a blog post on this a long time ago. I think the IBM Oracle Change Data Capture and Replication Server products both add triggers to the Oracle tables; you might have to go for an Oracle product to get log scraping. Even then you still need to do change-data determination: at the end of transformation, compare your delta data to your target table to see whether there really are any changes, and split those into inserts and updates for optimised database loads.
So some type of delta off the database and change data capture in DataStage works well.
You can build a change data capture capability in DataStage using the CRC32 function, search the archives for examples. It is less messy than trying to compare dozens of columns in a Transformer stage. In parallel jobs you have the change capture and change apply stages to do this.
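A sketch of what the CRC32 trick does, using Python's zlib.crc32 rather than the DataStage server-job function; the column layout and sample rows are assumptions.

```python
# Sketch of CRC32-based change detection: checksum the concatenated
# non-key columns and compare one number instead of dozens of columns.
# Row layout (key, name, city) is made up for illustration.
import zlib

def row_crc(row):
    # Delimited concatenation avoids ('ab','c') colliding with ('a','bc').
    return zlib.crc32("|".join(row[1:]).encode("utf-8"))

# Checksums previously stored for the target table, keyed by row key.
target = {1: row_crc(("1", "Smith", "Melbourne")),
          2: row_crc(("2", "Jones", "Sydney"))}

incoming = [("1", "Smith", "Melbourne"),   # unchanged -> ignore
            ("2", "Jones", "Brisbane"),    # changed   -> update
            ("3", "Brown", "Perth")]       # new       -> insert

inserts, updates = [], []
for row in incoming:
    key, crc = int(row[0]), row_crc(row)
    if key not in target:
        inserts.append(row)
    elif target[key] != crc:
        updates.append(row)
print(len(inserts), len(updates))   # -> 1 1
```

This is the server-job equivalent of what the parallel Change Capture / Change Apply stages do for you.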
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney