Primary Key for Staging Tables

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
nkln@you
Premium Member
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Primary Key for Staging Tables

Post by nkln@you »

I think that this Question is not directly related to this forum. But out of curiosity I am posting this Question.

Do Staging tables in Data Warehouse/DataMarts need Primary Key?

I feel primary keys are not needed for Staging Table as if they are any duplicates, they get rejected in Fact Table and Dimension Table.

but one of my colleagues says that primary keys are must for tables in Data Warehouse ( he even says primary key is must for all tables in a Data Warehouse)

Any inputs on this
Aim high
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

It really depends upon what you are using the staging tables for. It also depends upon where the staging layer comes in. Right in the begining of the cycle before transformations or after transformations and right before loading the target.
If your source has duplicates and you need to retain them, then dont create a primary key. Or just create a surrogate key.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I primarily use text files (or Data Sets in parallel jobs) for staging because they are FAST.

What does your "must have primary key" colleague think of that ?!!

I agree that primary key is a must for all tables in a Data Warehouse; this is one of the fundamental principles of star schema design. But not necessarily for staging areas.

Incidentally, the metadata forum probably would have been appropriate; whether or not a column is a Key is metadata.
Last edited by ray.wurlod on Tue Jun 12, 2007 5:12 am, edited 2 times in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ajay.prakash03
Participant
Posts: 54
Joined: Mon May 28, 2007 12:41 am
Location: Chennai

Post by ajay.prakash03 »

Yes, you have to have a primary key in all the tables in a Datawarehouse. In practice we replace the natural primary key with the surrogate key. Otherwise you can't load data into dimensions and fact tables.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The question is "Do staging tables in DW/DM need a Primary Key"? And the short answer is no. They don't fall under the general rule of being "a must for tables in a DW" as to me the staging area / back room "tables" are separate structures outside of the core DW itself.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Smeitei
Participant
Posts: 28
Joined: Tue Jan 23, 2007 3:14 pm

Post by Smeitei »

NO. There is no hard and fast rule that a primary Key should be defined in Staging Table
Post Reply