Page 1 of 1

Does SCD Apply to Compound Key Slow Change Dimension Table?

Posted: Tue Feb 26, 2013 2:43 pm
by olgc
Hi there,

Below is a type 2 slow change dimension table:

CREATE TABLE INDIVIDUAL_NAME (
PERSON_PARTY_ID INTEGER NOT NULL,
INDIVIDUAL_NAME_START_DT DATE NOT NULL,
GIVEN_NAME VARCHAR(40) NOT NULL,
PREFERRED_NAME VARCHAR(40),
MIDDLE_NAME VARCHAR(40),
FAMILY_NAME VARCHAR(40),
NAME_PREFIX_TXT VARCHAR(10),
NAME_SUFFIX_TXT VARCHAR(10),
INDIVIDUAL_NAME_END_DT DATE NOT NULL,
CONSTRAINT PK_INDIVIDUAL_NAME PRIMARY KEY(INDIVIDUAL_NAME_START_DT,PERSON_PARTY_ID)
)

Can this table be handled by SCD stage? We know type 2 table uses a surrogate key as the primary key and also to tack the history. But this table uses a person ID (even it's a surrogate key generated based on source person ID) and a start date to track history.

Thanks,

Posted: Tue Feb 26, 2013 3:12 pm
by ray.wurlod
Yes. The fact that multiple columns are marked as Key in the metadata drives this behaviour correctly.

Posted: Tue Feb 26, 2013 3:28 pm
by olgc
Thanks ray.wurlod

So it can be marked as below:

PERSON_PARTY_ID Business Key?
INDIVIDUAL_NAME_START_DT Effective date
GIVEN_NAME type 2
PREFERRED_NAME type 2
MIDDLE_NAME type 2
FAMILY_NAME type 2
NAME_PREFIX_TXT type 2
NAME_SUFFIX_TXT type 2
INDIVIDUAL_NAME_END_DT Expiration date

Also (INDIVIDUAL_NAME_START_DT,PERSON_PARTY_ID) is specified as primary key. Does this work?

Posted: Tue Feb 26, 2013 4:46 pm
by vmcburney
I don't think it will work. The SCD stage lets you choose to make a column a Surrogate Key field or a Business Key field but in this case PERSON_PARTY_ID is both. I think you have a bad table design - your compound key makes your Fact table too big, using a source system key as part of the surrogate key is flawed and performance will be impacted.

Posted: Tue Feb 26, 2013 7:42 pm
by ray.wurlod
The compound should be the business key, though a more efficient fact table would be constructed by amalgamating these into a single column, perhaps with delimiter characters. The Surrogate Key must be an integer of some kind, usually int64 (or even uint64) to allow for growth.

Posted: Wed Feb 27, 2013 8:55 am
by olgc
Some of us knew it's a bad data model, but it's built for years and built by paying big bucks by "expert" consultant. That's not the all yet. There are others even worse.

Here I just care whether SCD stage apply to this data model or not and try to understand SCD stage: what model it can handle, what doesn't. Not the model is good or bad.

Thanks,

Posted: Wed Feb 27, 2013 10:19 am
by olgc
ray.wurlod wrote:The compound should be the business key, though a more efficient fact table would be constructed by amalgamating these into a single column, perhaps with delimiter characters. The Surrogate Key must ...
In the fact table, only person_id is there. No start date is needed. So this is not a issue at all.

Thanks,

Posted: Thu Feb 28, 2013 4:02 pm
by olgc
Does anyone know that in order for SCD stage to work, both surrogate key and business key have to be in the target table? And you have to use surrogate key to track history for type 2. You can not use others, such as, start date to track history. The SCD stage can not handle tracking history by using start date as a part of the compound key.

Thanks,

Posted: Thu Feb 28, 2013 4:03 pm
by olgc
Does anyone know that in order for SCD stage to work, both surrogate key and business key have to be in the target table? And you have to use surrogate key to track history for type 2. You can not use others, such as, start date to track history. The SCD stage can not handle tracking history by using start date as a part of the compound key. That's my understanding for the SCD stage for now. I will try to get confirmation from IBM.

Thanks,