Why we preffere Surrogate key than the primary key

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
fareeda_b
Participant
Posts: 48
Joined: Sat Feb 23, 2008 4:25 pm

Why we preffere Surrogate key than the primary key

Post by fareeda_b »

we are maintaining the history in type2. for primary key value we are using surrogate key.

but in one interview he asked why u preferred surrogate key than the primary key and tellme the what is the difference b/w Surrogate key and primary key
Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

And what was your reply?

Have you searched here, Google, Wikipedia and/or The Data Warehousing Institute (to name a few) for the rationale of using surrogate key in Type 2 SCD?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

both enforce uniqueness to the record although SK is generated for only newly inserted records in the database based on some key columns.

In type 2 we simply insert the a new record based on a history handling column rather than updating it. Space would be an issue in type 2 as we are inserting multiple records for the same person for example.

Correct me if i am wrong?
tbharathkumar
Participant
Posts: 26
Joined: Mon Aug 27, 2007 6:27 am
Location: Des Moines

Re: Why we preffere Surrogate key than the primary key

Post by tbharathkumar »

Hi,

sorrogate key used to preserve referal integrity, tht's the answer!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

But it's the wrong answer.

A surrogate key preserves uniqueness, nothing more.

When you need to preserve history, such as to be able to track a person's name or address changes, then the primary key of PersonID alone won't do it. You'd need a primary key made up of PersonID and some way of identifying which is the current record, such as effective date and maybe an "is active" flag. Already the key structure is cumbersome. Instead, load this information into (possibly indexed) non-key columns and assign a surrogate, or artificial key. Integer is usually chosen as the data type, because these join very easily. However, SQL Server offers a GUID data type with similar characteristics, for example.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
AjayD
Participant
Posts: 12
Joined: Thu Aug 17, 2006 12:34 am

Post by AjayD »

Hi,

Surrogate key is a substitution for the natural primary key. It is just a
unique identifier or number for each row that can be used for the
primary key to the table. The only requirement for a surrogate primary
key is that it is unique for each row in the table. It is useful because
the natural primary key can change and this makes updates more
difficult.Surrogated keys are always integer or numeric.

Thx
AjayD
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Didn't realize there was a Google contest going on. :wink:
Surrogate key is a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.
http://datawarehouse.ittoolbox.com/docu ... e-key-1501#

Interesting how many 'interview answer sites' have this exact same phrase ( including substitution rather than substitute) indicating massive copying from the ITToolBox source.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply