Slowly Changin Dimension
Moderators: chulett, rschirm, roy
Slowly Changin Dimension
Hi
this is about Type 2 SCD...
i want to check the incoming records...whether it is already exist or not...if the record is a new one then insert into the database table...if the record already exit the update record should be inserted with new values and simultaneously it should update the previous record ( existing records)..
Eg...if i have Start Date and End Date....
ProductID StartDate EndDate
100 01/02/04 (will be null)(need to updated)
100 01/20/04 Null
A new value of existing record is inserted with the start date and endate as NULL, when this occurs the previous record's end date should be updated (it should be equal to the start date of the record inserted...)
Thank You
this is about Type 2 SCD...
i want to check the incoming records...whether it is already exist or not...if the record is a new one then insert into the database table...if the record already exit the update record should be inserted with new values and simultaneously it should update the previous record ( existing records)..
Eg...if i have Start Date and End Date....
ProductID StartDate EndDate
100 01/02/04 (will be null)(need to updated)
100 01/20/04 Null
A new value of existing record is inserted with the start date and endate as NULL, when this occurs the previous record's end date should be updated (it should be equal to the start date of the record inserted...)
Thank You
I did not see a question here. Are you asking how to design the ETL jobs to support such a task?
I would start first with your modeling. You forgot in your model to include a surrogate key for ProductID, maybe ProductKey would be a good name. You also have chosen to use NULL as a startdate. This means that you will never be able to use SQL BETWEEN clauses to find data. Furthermore, since you did not include a surrogate key, then everywhere you need have a this dimension as a foreign key you will have the natural key. This means every join to the dimension will have to do a dimension join using a BETWEEN statement. This is the worst performing SQL you could choose.
You should design a dimension like:
You should also use a date waaaay out in the future for the TVEndDate on the most current row. As you create the next time variant row, you will stamp the prior current row with an ending effective date, as well as remove the current indicator.
Your surrogate key will be assigned to the natural key plus the start date. This establishes uniqueness. Everywhere that this dimension is a foreign key you should embed the surrogate key. This means that your joins will be pure Star joins, without BETWEEN statements (which your current design makes difficult), and supports natural indexes. You also gain the the power that the appropriate time variant dimension row aligns to the fact, because your ETL did the heavy lifting during transformation. This further supports the effort to enforce RI during ETL, and allow you the discretion to remove RI constraints in the database.
Thoughts?
I would start first with your modeling. You forgot in your model to include a surrogate key for ProductID, maybe ProductKey would be a good name. You also have chosen to use NULL as a startdate. This means that you will never be able to use SQL BETWEEN clauses to find data. Furthermore, since you did not include a surrogate key, then everywhere you need have a this dimension as a foreign key you will have the natural key. This means every join to the dimension will have to do a dimension join using a BETWEEN statement. This is the worst performing SQL you could choose.
You should design a dimension like:
Code: Select all
ProductKey <pk>
ProductID <natural key>
...
...
...
TVStartDate <timestamp when your Time Variant row begins effectivity>
TVEndDate <timestamp when your Time Variant row ends effectivity>
TVnumber <incrementing counter of how many Time Variants exist for this natural key>
TVCurrentInd <TRUE/FALSE indicator if this row is the most current>
Your surrogate key will be assigned to the natural key plus the start date. This establishes uniqueness. Everywhere that this dimension is a foreign key you should embed the surrogate key. This means that your joins will be pure Star joins, without BETWEEN statements (which your current design makes difficult), and supports natural indexes. You also gain the the power that the appropriate time variant dimension row aligns to the fact, because your ETL did the heavy lifting during transformation. This further supports the effort to enforce RI during ETL, and allow you the discretion to remove RI constraints in the database.
Thoughts?
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Here's some posts that cover SCDs and their implementation using DataStage and alll the nasty stuff you need to think about:
viewtopic.php?t=86234
viewtopic.php?t=85542
viewtopic.php?t=85569
viewtopic.php?t=85822
viewtopic.php?t=86187
viewtopic.php?t=84998
viewtopic.php?t=86681
viewtopic.php?t=85970
viewtopic.php?t=86234
viewtopic.php?t=85542
viewtopic.php?t=85569
viewtopic.php?t=85822
viewtopic.php?t=86187
viewtopic.php?t=84998
viewtopic.php?t=86681
viewtopic.php?t=85970
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
-
- Participant
- Posts: 5
- Joined: Wed Feb 11, 2004 11:22 pm
Hi,kcbland wrote:Here's some posts that cover SCDs and their implementation using DataStage and alll the nasty stuff you need to think about:
viewtopic.php?t=86234
viewtopic.php?t=85542
viewtopic.php?t=85569
viewtopic.php?t=85822
viewtopic.php?t=86187
viewtopic.php?t=84998
viewtopic.php?t=86681
viewtopic.php?t=85970
These sites above mentioned are not anymore
thanks
vr
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I clicked on every one, and they opened a new Explorer. Perhaps you're blocking pop-ups?vijay_reddy wrote:These sites above mentioned are not anymore
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
-
- Participant
- Posts: 5
- Joined: Wed Feb 11, 2004 11:22 pm
Hi Kenneth,kcbland wrote:I clicked on every one, and they opened a new Explorer. Perhaps you're blocking pop-ups?vijay_reddy wrote:These sites above mentioned are not anymore
REally it did not opened , i tried n number of times.
Now it is opened, may be there was some problem with my net.
Thanks
vr
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I've seen something like that before.
It's documented at http://www.dilbert.com/comics/dilbert/a ... 522841.jpg
It's documented at http://www.dilbert.com/comics/dilbert/a ... 522841.jpg
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.