Exact use of staging tables?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ravireddy25
Participant
Posts: 59
Joined: Wed Dec 31, 2008 5:49 am

Exact use of staging tables?

Post by ravireddy25 »

Exact use of staging tables?
Ravi
arvind_ds
Participant
Posts: 428
Joined: Thu Aug 16, 2007 11:38 pm
Location: Manali

Post by arvind_ds »

Staging tables are used to store extracted data temporarily to avoid dependency on source systems. Once data is loaded into staging tables, it can further be loaded into desired dimension/detail/transaction/fact tables.
Its a kind of intermediate storage.
Arvind
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

While I normally avoid crap sites like 'Geek Interview' there's a decent answer out there on it here. Typical nonsense here. In both cases the comments are fairly useless. Google up your own answers for more fun. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Exact use of staging tables? Staging.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That made me L-O-L. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
dsex100
Premium Member
Premium Member
Posts: 45
Joined: Wed Apr 01, 2009 9:41 am

Post by dsex100 »

They're also good to make sure that all your data has been loaded into the target data warehouse area (i.e. the "staging" area).

In the case where for example, not all your source data files where available then you could wait until all the staging jobs completed, at this point you'd know all your data is ready to be loaded into the data warehouse.

Then you can load your work tables using some transformations, the whole process happening in it's own separate environment.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I prefer Data Sets for that (with parallel jobs). They are more efficiently loaded and unloaded than database tables.

:idea: Things like referential integrity should have been checked earlier in the ETL process - you don't need a "staging database" to do that.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply