Incremental Load with Multiple Sources


Post by Havoc »

This is more of a solution question than a DataStage question. We have a requirement as follows:

There are two or more source tables, each with a different load frequency (meaning these tables might be refreshed on different dates). All of these sources load the same target table.

The approach used when loading this target table is Incremental Load.

Is there a way to load this target table without losing data, given that one source might be loaded on a day when another is not? Is there a standard method for doing this?

There are no timestamps in the target table.

Thanks in advance :)

Post by ray.wurlod »

The best approach would be to write out a detailed specification of what needs to happen under various scenarios, and to design accordingly.

It would be really useful if the target included some way to recognize when it was most recently updated - either a timestamp or a unique run ID. Even though you state that it doesn't, your specification may be used to prove that it is necessary. ALTER TABLE can add a new column.