Join at Extract or later

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
DSDeveloper
Premium Member
Premium Member
Posts: 1
Joined: Tue Jun 12, 2007 10:22 am

Join at Extract or later

Post by DSDeveloper »

Hello,

I am a newbie in DSS and BI world, have been reading various DW books to improve.

I have a general question regarding the dimensions. OLTP system has a highly normalized structure while DW structures are denormalized.

Using DataStage - When we pull the information from the source system (for the dimensions), the information comes from multiple tables.

Which way is better
1) While pulling the information join the multiple tables and put the information in staging area(RIGHT AT THE SOURCE)
Or
2) Bring the information separately(multiple tables) and then after the information is in staging area then do the joins.

Thanks a million for your time.

Regards
Eric
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard.

Where possible, join at extract. There are two main reasons.

First, the join may be able to be assisted by indexes and so be achieved faster.

Second, the result of the join will contain fewer rows than the unjoined contents of the source tables; this is even more important if these rows must be transmitted over a network. Fewer rows will be processed faster.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply