Design issue: Dimesnion Key that changes.

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
gotosarath
Participant
Posts: 13
Joined: Mon Sep 13, 2004 5:01 am

Design issue: Dimesnion Key that changes.

Post by gotosarath »

Hi all,

I have a dimension for which key column can change. I have SSN (Social Security Number) and employee_num that are key columns in employee dimension. We have found that we ended up getting duplicate records for same employee when ever there is change in either of columns.

Why change in SSN? Due to typo they might enter incorrect SSN. So when this is corrected it comes as a new record into data warehouse.
Why Change in Employee_num? When an employee quits and leave the company and rejoins after sometime he is assigned a new employee number. Also if an employee gets transferred to a new business unit, he is given a new employee_num. (Treated as termination + rejoin)

Can someone explain how to handle this? This is more a design problem than DataStage problem. TIA.

Thanks,
Sarath
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

This is not a tool issue. The industry practice is called surrogate keys, and they require a natural key for assignment. The use of a changeable value as a primary key is a fundamental mistake in data modeling. The unique columns that identify a row are called the "natural key". You need to talk to your architects, they messed up.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

I would make the natural key your employee number. I can see where someone would make the business decision that a rehire would have a new employee number. This would also solve the issue of an employee fat fingering the ssn. When the correction is made it would update the ssn. Kenneth is correct if your primary key is emp no and ssn, somebody messed up.

One quesion I have for you is do you consider having an employee with two employee numbers as a duplicate? If you do you might want to search on data cleansing and survivorship.

Hope this helps,
dnsjain
Charter Member
Charter Member
Posts: 34
Joined: Thu May 08, 2003 2:12 pm

Post by dnsjain »

I would create a new surrogate key for the dimension and make employee number and SSN number as part of unique alternate key.

Surrogate key will help me other records in fact table to link properly with history records.
Post Reply