Page 1 of 1

Problem in joining 2 tables

Posted: Thu Jun 15, 2006 12:44 am
by Amarpreet
I have 2 tables to join, in which the column names are not same and the condition is not simple. I have a condition like this:
where table1.a>table2.b and table1.a<=table2.c(in between condition). How can I do this in parallel job.

Posted: Thu Jun 15, 2006 1:03 am
by nick.bond
Are the two tables in the same database? If so join in the database first.

If they are not it is more difficult. Are those the only two key fields or are there more key fields that you could join on but which would return multiple records for the same keys? If so you could join on the other keys passing through table1.a, table2.b and table3.c, which will give you X-product and then use a filter stage to restrict output to the records where your condition is true.

Posted: Thu Jun 15, 2006 2:16 am
by ray.wurlod
Rename the columns in an upstream Copy or Modify stage so that the key column names do match.

Posted: Thu Jun 15, 2006 8:42 am
by seanc217
You could also use column aliases to rename the column utilizing custom sql. In an Oracle DB you would do it as such select col1 as column1....

HTH

Re: Problem in joining 2 tables

Posted: Thu Jun 15, 2006 9:04 am
by sud
Amarpreet wrote:table1.a>table2.b and table1.a<=table2.c(in between condition)
You should not need a join condition at all and hence you won't need renaming columns. Achieve a cross join of the two tables and put a filter where you can specify the condition you mentioned.

Posted: Thu Jun 15, 2006 9:15 am
by kumar_s
Or do a lookup to the table2 for a dummy column with a value say '1'. You can check the lookup condition as give by you table1.a>table2.b and table1.a<=table2.c.

Posted: Thu Jun 15, 2006 9:24 am
by sud
kumar_s wrote:Or do a lookup to the table2 for a dummy column with a value say '1'. You can check the lookup condition as give by you table1.a>table2.b and table1.a<=table2.c.
In this case if there is a possibility of more than one record satisfying the condition, please do not forget to set the link name from which multiple records can come (u can set this property by double clicking the lookup condition)

Posted: Thu Jun 15, 2006 9:29 am
by kumar_s
Which is very similar ot Cartesian product by using Join stage.

Posted: Thu Jun 15, 2006 4:12 pm
by ray.wurlod
The Cartesian product "solution" proposed by a couple of posters will be very long-running if both tables have large row counts.

Posted: Fri Jun 16, 2006 3:06 am
by nick.bond
If the tables are not in the same database load them both into temp staging tables in the same database and perform join in there.

Posted: Fri Jun 16, 2006 6:17 am
by kumar_s
bondcaliwood wrote:If the tables are not in the same database load them both into temp staging tables in the same database and perform join in there.
Still it will run for long if row are large in numbers.

Posted: Thu Apr 17, 2008 9:15 am
by teapczynski
sud wrote:
kumar_s wrote:Or do a lookup to the table2 for a dummy column with a value say '1'. You can check the lookup condition as give by you table1.a>table2.b and table1.a<=table2.c.
In this case if there is a possibility of more than one record satisfying the condition, please do not forget to set the link name from which multiple records can come (u can set this property by double clicking the lookup condition)
How exactly would you specify the lookup condition (I am new to DataStage). I need to generate a undetermined number of output rows per input. This thread seems to get me the closest but I am having issues reducing the output to the rows I am looking for.

Posted: Thu Apr 17, 2008 9:36 am
by sud
Well, you can check out the lookup stage and use dummy columns in both primary and secondary lookup inputs to achieve a cross join and set lookup condition in the lookup stage.

But, what do you mean by "generate a undetermined number of output rows per input" :?: