dwh-setup using compare and difference-stages
Posted: Wed Sep 22, 2004 6:27 am
I'm setting up a new DWH and now we're working on the templates for filling dimensions. What we now have is that everything is coming in daily(no delta's but complete tables), and we transform this into the same format as the dimension table.
At this point we need to do 3 actions
-check for inserts: all records in input and not in dwh
-check for updates : all records in input for which we find the key in dwh, and for which relevant columns have changed
-check for deletes:records in dwh that are not in the inputs anymore.
When going trough the documentation of DS7.5 I've found 2 stagetypes which seem to be ideal for this, but with which I'm not familiar: the difference-stage and the compare stage.
Would it be a wise setup to use the following as template:
-create 2 ordered sets:one for the input, one for the current status of the dimension.
-use a difference with input as after and dwh as before : this will give the inserts
-use a compare :this will give us the updates
-use a difference with dwh as after and input as before: this will give us the deletes
main advantage I see is that this can easily be reused, main disadvantage seems the large amount of data we'll need to process.
Any one else any idea's?
At this point we need to do 3 actions
-check for inserts: all records in input and not in dwh
-check for updates : all records in input for which we find the key in dwh, and for which relevant columns have changed
-check for deletes:records in dwh that are not in the inputs anymore.
When going trough the documentation of DS7.5 I've found 2 stagetypes which seem to be ideal for this, but with which I'm not familiar: the difference-stage and the compare stage.
Would it be a wise setup to use the following as template:
-create 2 ordered sets:one for the input, one for the current status of the dimension.
-use a difference with input as after and dwh as before : this will give the inserts
-use a compare :this will give us the updates
-use a difference with dwh as after and input as before: this will give us the deletes
main advantage I see is that this can easily be reused, main disadvantage seems the large amount of data we'll need to process.
Any one else any idea's?