Page 1 of 1

how to use join stage for joinig more than two tables

Posted: Tue Aug 18, 2009 7:50 am
by agrawal.sunil
Hi,
I want to use Join Stage for joining three tables. Suppose I have 3 tables names respectively: A, B and C. Table A and B are to be joined on key column 'K1' and table B and C are joined on key column 'K2'. I have specified table B as intermediate, table A as left and table C as right, in the link ordering setting. Please help me to specify as to which key column of K1 and K2 is meant for which pair of tables that I want to join.
Pair1-> Left - Intermediate
Pair2-> Right - Intermediate

Kindly help me in this matter.
Thanks,
Sunil

Posted: Tue Aug 18, 2009 8:04 am
by aaryabhatta
I guess its best to use the User-defined SQL query in database stage, in which case database indexes will be used and will be faster compared to join stage.

Re: how to use join stage for joinig more than two tables

Posted: Tue Aug 18, 2009 8:43 am
by Klaus Schaefer
In one join stage you can join multiple links on the same keys only. In your case you have to use 2 joins: join A and B on K1 and then repartion, sort and join B and C on K2.

Klaus

Posted: Tue Aug 18, 2009 9:50 am
by dsuser_cai
Do a row count before going for a join stage. If the row count is very less use SQL query to join the tables, ot use a look up instead, go for join only if its necessary(very huge data/less space)

Re: how to use join stage for joinig more than two tables

Posted: Tue Aug 18, 2009 10:58 pm
by agrawal.sunil
Klaus Schaefer wrote:In one join stage you can join multiple links on the same keys only. In your case you have to use 2 joins: join A and B on K1 and then repartion, sort and join B and C on K2.

Klaus
Thanks Klaus :)

Posted: Tue Aug 18, 2009 11:45 pm
by anandsiva
Do use SQL which is much more faster for these kind of joins. We are using these kind of joins and we generally prefer SQLs for these operations.

Posted: Wed Aug 19, 2009 6:17 pm
by ray.wurlod
The short answer to the original question is that the Join stage supports more than two input links if required. They are called "Left", "Intermediate" and "Right" in the Input Link Ordering dialog. I believe that joins are performed pairwise, beginning from the left.

Posted: Thu Jul 29, 2010 1:12 pm
by Marley777
What is an intermediate file, what does it give you?

Posted: Thu Jul 29, 2010 3:38 pm
by ray.wurlod
More than two inputs.

In SQL it would look something like
SELECT columns FROM tableA,tableB,tableC
WHERE tableA.key = tableB.key AND tableA.key = tableC.key

tableB is the intemediate here.