Performance using SQL query or Join/Lookup stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
eswar1
Participant
Posts: 10
Joined: Sat Jan 01, 2011 6:21 am

Performance using SQL query or Join/Lookup stage

Post by eswar1 »

Hi,
I'm having 9 source tables.I'm using DB2 database as Source and Target also.There is common key for all source tables.I have two options to design a job.
First One is take 9 tables DB2 stages and join them using join/lookup stages.
Second One is using Souce SQL Query join all tables like this
select
A.col1,A.col2,A.col3,B.COL4,B.COL5 , C.COL6,C.COL7,D.col8,D.col9 from ORG A,ORGN_NME B,FIN_ORG C,INTR_ORGN D WHERE
A.col1= B.col1 AND A.col1= C.col1 and A.col1= D.col1
I'm concern about Performance.Which'll give good performance.
Please Clarify it.

Warm Regards,
eswar
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There are two many variables for anyone to know that answer, if you really want to compare / contrast the two methodologies build both. Me, I would assume the 'all in the database' would work out better.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If the join keys are supported by indexes, then the equi-join in the database will certainly and substantially out-perform the join in DataStage. A further benefit is that DataStage will only need to process those rows that are the result of the join, rather than all rows from all tables.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
SURA
Premium Member
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

If the data volume is huge, rather than doing it in the DB level, better to use it in Join level will be fater is my understanding.

Still as chulett said, better do both the way and share the result.

DS User
Post Reply