Steps to take Fact load

Phani01
Participant
Posts: 7
Joined: Mon Jun 22, 2009 8:41 am

Steps to take Fact load

Post by Phani01 »

Can anyone guide me through the steps to take to load a fact table?
Performance should be good and the data load should be error-free.

AND

Suppose I want to load 100 million records and the job fails after 50 million records. Is there any way to load only the remaining 50 million rows instead of reloading everything from scratch?

Any help would be appreciated.
Thanks,
Phani Kumar
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

1. "Performance should be good". This topic is far too complex to discuss in a single thread. It depends upon so many different factors ranging from performance expectations (e.g. I have a window of 3 hours to load my data and it must finish within that time) to database SQL optimization with literally hundreds of other factors to put into the equation.

2. "data load should be error free". That has nothing to do with steps to take but is a matter of job design and implementation and mainly of very good planning.

3. "Suppose i want to load 100million records and Job failed after 50 million...". Yes, you can code this several different ways. How you implement this restartability is heavily dependent upon what sort of writes you are doing ("inserts" only vs. "inserts" and "updates" to a table, the former being easier and the latter more complex to implement).
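
As a rough illustration of the insert-only case, here is a minimal Python sketch using the standard-library sqlite3 module rather than DataStage itself. The table names fact_sales and load_checkpoint, the function load_with_checkpoint, and the fail_after switch are all hypothetical, made up for this example. The idea is to commit in batches and record the position of the last committed row in the same transaction, so that a rerun can skip what is already loaded. It assumes the source can be re-read in the same order on every run.

```python
import sqlite3


def load_with_checkpoint(conn, rows, batch_size=1000, fail_after=None):
    """Insert-only load that commits in batches and records its progress.

    rows: list of (business_key, amount) tuples in a stable, repeatable order.
    fail_after: hypothetical test hook that simulates a crash once the load
                would pass this many rows.
    Returns the row position the load started (or resumed) from.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(src_row INTEGER PRIMARY KEY, business_key TEXT, amount REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_checkpoint "
        "(job TEXT PRIMARY KEY, last_row INTEGER)"
    )
    conn.commit()

    # Find out how far a previous run got (0 if this is the first run).
    found = conn.execute(
        "SELECT last_row FROM load_checkpoint WHERE job = 'fact_sales'"
    ).fetchone()
    start = found[0] if found else 0

    for batch_start in range(start, len(rows), batch_size):
        batch = rows[batch_start:batch_start + batch_size]
        batch_end = batch_start + len(batch)
        if fail_after is not None and batch_end > fail_after:
            conn.rollback()
            raise RuntimeError("simulated failure")
        conn.executemany(
            "INSERT INTO fact_sales VALUES (?, ?, ?)",
            [(batch_start + i + 1, key, amount)
             for i, (key, amount) in enumerate(batch)],
        )
        # The checkpoint is written in the same transaction as the batch,
        # so the data and the recorded position cannot drift apart.
        conn.execute(
            "INSERT OR REPLACE INTO load_checkpoint VALUES ('fact_sales', ?)",
            (batch_end,),
        )
        conn.commit()
    return start
```

Calling the function a second time with the same rows after a failure resumes from the stored position. A load that also does updates needs more than this, for example a way to tell whether a given row's update has already been applied.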
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

4. "steps to take to load fact table". This part is actually the easiest part. Stream your core data in (typically whatever measures the fact contains and the associated business keys), do lookups against all of the fact's dimension tables to retrieve their surrogate keys, and then insert the fact record. Voilà! ;)
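
The flow described above can be sketched in plain Python, with dictionaries standing in for the dimension lookups. This is only an illustration of the logic, not DataStage code; the function build_fact_rows and the column names (customer_code, product_code, customer_sk, product_sk, amount) are hypothetical. Rows whose business key is not found in a dimension are set aside as rejects rather than inserted.

```python
def build_fact_rows(source_rows, dimension_lookups, measure_columns):
    """Swap business keys for surrogate keys; collect rows that fail a lookup.

    dimension_lookups maps a business-key column name to a tuple of
    (surrogate-key column name, {business key: surrogate key}).
    """
    facts, rejects = [], []
    for row in source_rows:
        # Carry the measures through unchanged.
        fact = {name: row[name] for name in measure_columns}
        missing = []
        for key_column, (sk_column, lookup) in dimension_lookups.items():
            surrogate = lookup.get(row[key_column])
            if surrogate is None:
                missing.append(key_column)
            else:
                fact[sk_column] = surrogate
        if missing:
            rejects.append((row, missing))
        else:
            facts.append(fact)
    return facts, rejects


# Hypothetical dimension contents and source rows.
lookups = {
    "customer_code": ("customer_sk", {"C1": 101, "C2": 102}),
    "product_code": ("product_sk", {"P1": 201}),
}
source = [
    {"customer_code": "C1", "product_code": "P1", "amount": 10.0},
    {"customer_code": "C2", "product_code": "P9", "amount": 5.0},
]
facts, rejects = build_fact_rows(source, lookups, ["amount"])
```

Here the first row becomes a fact row ready to insert, and the second lands in rejects because P9 has no entry in the product lookup.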
-craig

"You can never have too many knives" -- Logan Nine Fingers
Phani01
Participant
Posts: 7
Joined: Mon Jun 22, 2009 8:41 am

Post by Phani01 »

Thanks guys.

While inserting into the fact table, do we drop the indexes and remove any relationships between the dimensions and the fact for better performance?
Thanks,
Phani Kumar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It depends: maybe yes, maybe no. Usually one would try it both ways and compare.
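
One way to "try both ways" is a small timing harness like the Python sketch below. It uses an in-memory SQLite database purely as a stand-in, so the numbers say nothing about any real warehouse database; the table fact_sales, the index ix_fact_customer, and the function timed_load are hypothetical names. The same comparison would need to be run against the actual target with realistic volumes.

```python
import sqlite3
import time


def timed_load(rows, drop_index_first):
    """Load rows with the index kept in place, or dropped and rebuilt.

    Returns (elapsed_seconds, row_count). The elapsed time includes the
    index rebuild when the index was dropped, so the comparison is fair.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_sales "
        "(customer_sk INTEGER, product_sk INTEGER, amount REAL)"
    )
    conn.execute("CREATE INDEX ix_fact_customer ON fact_sales (customer_sk)")
    conn.commit()

    started = time.perf_counter()
    if drop_index_first:
        conn.execute("DROP INDEX ix_fact_customer")
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    if drop_index_first:
        conn.execute(
            "CREATE INDEX ix_fact_customer ON fact_sales (customer_sk)"
        )
    conn.commit()
    elapsed = time.perf_counter() - started

    count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
    conn.close()
    return elapsed, count


rows = [(i % 100, i % 10, float(i)) for i in range(1000)]
results = {flag: timed_load(rows, flag) for flag in (True, False)}
```

results[True] holds the timing for the drop-and-rebuild run and results[False] the timing with the index left in place.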
-craig
