Does SCD Apply to Compound Key Slow Change Dimension Table?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Does SCD Apply to Compound Key Slow Change Dimension Table?

Post by olgc »

Hi there,

Below is a type 2 slow change dimension table:

CREATE TABLE INDIVIDUAL_NAME (
PERSON_PARTY_ID INTEGER NOT NULL,
INDIVIDUAL_NAME_START_DT DATE NOT NULL,
GIVEN_NAME VARCHAR(40) NOT NULL,
PREFERRED_NAME VARCHAR(40),
MIDDLE_NAME VARCHAR(40),
FAMILY_NAME VARCHAR(40),
NAME_PREFIX_TXT VARCHAR(10),
NAME_SUFFIX_TXT VARCHAR(10),
INDIVIDUAL_NAME_END_DT DATE NOT NULL,
CONSTRAINT PK_INDIVIDUAL_NAME PRIMARY KEY(INDIVIDUAL_NAME_START_DT,PERSON_PARTY_ID)
)

Can this table be handled by SCD stage? We know type 2 table uses a surrogate key as the primary key and also to tack the history. But this table uses a person ID (even it's a surrogate key generated based on source person ID) and a start date to track history.

Thanks,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes. The fact that multiple columns are marked as Key in the metadata drives this behaviour correctly.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Post by olgc »

Thanks ray.wurlod

So it can be marked as below:

PERSON_PARTY_ID Business Key?
INDIVIDUAL_NAME_START_DT Effective date
GIVEN_NAME type 2
PREFERRED_NAME type 2
MIDDLE_NAME type 2
FAMILY_NAME type 2
NAME_PREFIX_TXT type 2
NAME_SUFFIX_TXT type 2
INDIVIDUAL_NAME_END_DT Expiration date

Also (INDIVIDUAL_NAME_START_DT,PERSON_PARTY_ID) is specified as primary key. Does this work?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I don't think it will work. The SCD stage lets you choose to make a column a Surrogate Key field or a Business Key field but in this case PERSON_PARTY_ID is both. I think you have a bad table design - your compound key makes your Fact table too big, using a source system key as part of the surrogate key is flawed and performance will be impacted.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The compound should be the business key, though a more efficient fact table would be constructed by amalgamating these into a single column, perhaps with delimiter characters. The Surrogate Key must be an integer of some kind, usually int64 (or even uint64) to allow for growth.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Post by olgc »

Some of us knew it's a bad data model, but it's built for years and built by paying big bucks by "expert" consultant. That's not the all yet. There are others even worse.

Here I just care whether SCD stage apply to this data model or not and try to understand SCD stage: what model it can handle, what doesn't. Not the model is good or bad.

Thanks,
olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Post by olgc »

ray.wurlod wrote:The compound should be the business key, though a more efficient fact table would be constructed by amalgamating these into a single column, perhaps with delimiter characters. The Surrogate Key must ...
In the fact table, only person_id is there. No start date is needed. So this is not a issue at all.

Thanks,
olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Post by olgc »

Does anyone know that in order for SCD stage to work, both surrogate key and business key have to be in the target table? And you have to use surrogate key to track history for type 2. You can not use others, such as, start date to track history. The SCD stage can not handle tracking history by using start date as a part of the compound key.

Thanks,
olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Post by olgc »

Does anyone know that in order for SCD stage to work, both surrogate key and business key have to be in the target table? And you have to use surrogate key to track history for type 2. You can not use others, such as, start date to track history. The SCD stage can not handle tracking history by using start date as a part of the compound key. That's my understanding for the SCD stage for now. I will try to get confirmation from IBM.

Thanks,
Last edited by olgc on Thu Feb 28, 2013 4:05 pm, edited 1 time in total.
Post Reply