Page 1 of 1

Design issue: Dimesnion Key that changes.

Posted: Wed Jan 25, 2006 11:44 am
by gotosarath
Hi all,

I have a dimension for which key column can change. I have SSN (Social Security Number) and employee_num that are key columns in employee dimension. We have found that we ended up getting duplicate records for same employee when ever there is change in either of columns.

Why change in SSN? Due to typo they might enter incorrect SSN. So when this is corrected it comes as a new record into data warehouse.
Why Change in Employee_num? When an employee quits and leave the company and rejoins after sometime he is assigned a new employee number. Also if an employee gets transferred to a new business unit, he is given a new employee_num. (Treated as termination + rejoin)

Can someone explain how to handle this? This is more a design problem than DataStage problem. TIA.

Thanks,
Sarath

Posted: Wed Jan 25, 2006 12:17 pm
by kcbland
This is not a tool issue. The industry practice is called surrogate keys, and they require a natural key for assignment. The use of a changeable value as a primary key is a fundamental mistake in data modeling. The unique columns that identify a row are called the "natural key". You need to talk to your architects, they messed up.

Posted: Wed Jan 25, 2006 2:00 pm
by kwwilliams
I would make the natural key your employee number. I can see where someone would make the business decision that a rehire would have a new employee number. This would also solve the issue of an employee fat fingering the ssn. When the correction is made it would update the ssn. Kenneth is correct if your primary key is emp no and ssn, somebody messed up.

One quesion I have for you is do you consider having an employee with two employee numbers as a duplicate? If you do you might want to search on data cleansing and survivorship.

Hope this helps,

Posted: Wed Jan 25, 2006 3:07 pm
by dnsjain
I would create a new surrogate key for the dimension and make employee number and SSN number as part of unique alternate key.

Surrogate key will help me other records in fact table to link properly with history records.