Hello there
I'm trying to compare a new snapshot table (Oracle) with the previous, geta delta and then apply the delta to a destination table
The size of the snapshot(s) is quite big, 2-3 mil rows.
Which approach do you recomand? using datasets, Outer Joins?...
Thanks for your assistence
D
Comparing a new db snapshot to the previous and get a delta
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard.
Many approaches are available. First question is how accessible the previous snapshot is. It's probable that Oracle has a comparison tool (I don't know).
If you want to do it within DataStage, probably the best approach is to use one of the change detection stage types, namely Difference, Compare or Change Capture. These are all similar, but subtly different. It depends what you want as the result which you choose.
Many approaches are available. First question is how accessible the previous snapshot is. It's probable that Oracle has a comparison tool (I don't know).
If you want to do it within DataStage, probably the best approach is to use one of the change detection stage types, namely Difference, Compare or Change Capture. These are all similar, but subtly different. It depends what you want as the result which you choose.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks for your reply Ray
I want to do it in DataStage, as a proof of concept.
Both snapshots are in the same database. Their size is bigger than I thought at the beginning, they have arround 20 mil rows each.
By comparing the earlier snapshot to a latter snapshot we can get the static set and the changed set. The set of differences can be split into 'old rows no longer in the base' (delta1) and 'rows added to the base' (delta2)
Bases should be compared on the defined PK of the target table.
I tried the ChangeCapture stage...question is: can I have the two input links straight from the two transformers? Or should they go through an intermediary stage? Do they need to be sorted before?
I marked as keys the fields that are making the PK on the target table leaving ALL the other fields as values. Not sure if this is correct...
Thanks for your help
I want to do it in DataStage, as a proof of concept.
Both snapshots are in the same database. Their size is bigger than I thought at the beginning, they have arround 20 mil rows each.
By comparing the earlier snapshot to a latter snapshot we can get the static set and the changed set. The set of differences can be split into 'old rows no longer in the base' (delta1) and 'rows added to the base' (delta2)
Bases should be compared on the defined PK of the target table.
I tried the ChangeCapture stage...question is: can I have the two input links straight from the two transformers? Or should they go through an intermediary stage? Do they need to be sorted before?
I marked as keys the fields that are making the PK on the target table leaving ALL the other fields as values. Not sure if this is correct...
Thanks for your help