Join multiple DB2 tables

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
bpaje
Participant
Posts: 2
Joined: Tue May 08, 2007 9:23 pm

Join multiple DB2 tables

Post by bpaje »

Hi to all the Gurus and helpful memeber of this forum,

I am new to DataStage and I would just like to ask for your opinion regarding one of my current project wherein I will need to Join 6 tables and output it into a single table:

Table A = 50,000 records
Table B = 300,000 records
Table c = 300,000 records
Table D = 3,000,000 records
Table E = 100,000 records
Table F = 5,000,000 records

with the following conditions:

A.id = B.id
B.name = C.name
C.code = D.code
D.add = E.add
E.state = F.state
F.version = 'A'

my Target table will contain the columns
id, name, code, add, state & version

All Source Tables and Target Table are in DB2.

Can I use one DB2 stage and join all tables using a SQL statement then link it to my Target Table or Is there a better and more efficient way of doing this?

Thanks in advance!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

In the database is usually better, because the join can leverage any indexes that may exist on the join keys.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rafik2k
Participant
Posts: 182
Joined: Wed Nov 23, 2005 1:36 am
Location: Sydney

Post by rafik2k »

It all depends on ur requirement.

For example
Table A = 50,000 records
Table B = 300,000 records

If u have one to many or many to one or both relationship on above two table the the result u will get a cartesian product from both which may differ from your expected result set, when u fire the query at database level.

If the requirement is to get all records from Table A and their matching records only from rest of the table, then u have treat Table A as driving table and rest of the table is lookup/hash table.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

U isn't performing the query. bpaje is.
:roll:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bpaje
Participant
Posts: 2
Joined: Tue May 08, 2007 9:23 pm

Post by bpaje »

Ok thanks ray.wurlod & rafik2k for your inputs.
rafik2k
Participant
Posts: 182
Joined: Wed Nov 23, 2005 1:36 am
Location: Sydney

Post by rafik2k »

ray.wurlod wrote:U isn't performing the query. bpaje is.
:roll:
Ray, ofcourse It's not me..
But I had similar kind of issue while joining multiple source table.
uccio2002
Participant
Posts: 18
Joined: Tue Mar 02, 2004 3:38 am

Re: Join multiple DB2 tables

Post by uccio2002 »

due to big numer of record on f table peraps it is etter to do not have only a step with multiple join, but have different job (two it i great i suppose) where the second one read the output of first one and make the other join using temporary table. off course avoid to use look up instead of join , them are so slow
baglasumit21
Participant
Posts: 132
Joined: Wed Mar 01, 2006 11:12 pm
Location: Pune

Re: Join multiple DB2 tables

Post by baglasumit21 »

What i think is Using hash files and look ups would be a better option as it would be a bit faster than the joins
SMB
Post Reply