Oracle Enterprise Stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
srekant
Premium Member
Premium Member
Posts: 85
Joined: Wed Jan 19, 2005 6:52 am
Location: Detroit

Oracle Enterprise Stage

Post by srekant »

HI,

I have to select data from Fact and dimension table which appraoch is better.Is it Better to use a user defined sql (having decode,sum,groupBy,Orderby )having a join of Fact table and Dimension table which runs in to GB or do a simple select on the fact table and compare with dimesions as lookup

I am working on DS EE 7.5 and Oracle server is at a different location.

Any Inputs are appreciated.
Sree
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

You are using enterprise edition so you don't want this SQL select to become a bottleneck, ie. you don't want a million dollars worth of ETL hardware twiddling it's thumbs while your Oracle source spends hours trying to run a large and complex SQL. So one objective is to use SQL that comes back quickly.

I prefer to keep transformation functions within DataStage and leave the SQL select list as clean as possible, avoiding column renames and functions such as decode. This helps with data lineage and metadata reporting.

If you have a large number of columns to retrieve you may find the job easier to build and maintain using generated SQL or the SQL Query Builder as it keeps the stage column list synchronised with the query column list.

The other thing I try to do in SQL stages is to reduce the number of rows brought to the ETL server by putting the filters into the WHERE clause. This may mean joining tables that are important to the filter commands.
Post Reply