Hi Gurus,
How could I perform cartesian join in Datastage PX? My requirement is to get the cartesian join from multiple input files.
I have tried to use Join Stage to do it, but it seems that for Full Outer Join it allows only 2 input links. And I could not perform cartesian join in Database either.
If you have any idea how to perform cartesian join in PX, please suggest.
Thanks
Cartesian Join in PX
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 22
- Joined: Wed Sep 17, 2003 12:21 pm
- Location: Sydney
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard.
The Join stage will perform a Cartesian products for Full Outer Join if there are duplicates on the Left input.
You are right that a maximum of two inputs is specified for Full Outer Join.
Use more than one Join stage in a cascade. For example for three inputs you would use two Join stages, the second performing a Join onto the result of the first.
The Join stage will perform a Cartesian products for Full Outer Join if there are duplicates on the Left input.
You are right that a maximum of two inputs is specified for Full Outer Join.
Use more than one Join stage in a cascade. For example for three inputs you would use two Join stages, the second performing a Join onto the result of the first.
Code: Select all
Source1 ------>
Join1 ----->
Source2 ------> Join2 ----->
Source3 --------------------->
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 22
- Joined: Wed Sep 17, 2003 12:21 pm
- Location: Sydney
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
A seven way Cartesian join is going to be massive no matter how you undertake it. With only 10 rows per input, you're looking at 10 million rows (10^7) on the output.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hey Ann,
This topic is marked as having a work around. Do you mind to share how you resolved this issue? I have two files with around 1500 disctinct values (including the non key columns) in both.
Here is the lay out.
file-1
x 1
x 2
x 3
x 4
x 5
file -2
x null 12 34
x null 13 24
x null 11 22
x null 45 23
x null 34 56
My job has to get the cartesian product, the second column values in the file-1 has to replace the null in file-2 for all the columns,
the expected out put is like this
x 1 12 34
x 1 13 24
x 1 11 22
x 1 45 23
x 1 34 56
x 2 12 34
x 2 13 24
x 2 11 22
x 2 45 23
x 2 34 56
x 3 12 34
x 3 13 24
x 3 11 22
x 3 45 23
x 3 34 56
etc...
Thanks for your help.
This topic is marked as having a work around. Do you mind to share how you resolved this issue? I have two files with around 1500 disctinct values (including the non key columns) in both.
Here is the lay out.
file-1
x 1
x 2
x 3
x 4
x 5
file -2
x null 12 34
x null 13 24
x null 11 22
x null 45 23
x null 34 56
My job has to get the cartesian product, the second column values in the file-1 has to replace the null in file-2 for all the columns,
the expected out put is like this
x 1 12 34
x 1 13 24
x 1 11 22
x 1 45 23
x 1 34 56
x 2 12 34
x 2 13 24
x 2 11 22
x 2 45 23
x 2 34 56
x 3 12 34
x 3 13 24
x 3 11 22
x 3 45 23
x 3 34 56
etc...
Thanks for your help.
-
- Premium Member
- Posts: 22
- Joined: Wed Sep 17, 2003 12:21 pm
- Location: Sydney
Cross join simplest option
To do a cross join with two files add a dummy column to both the files say "key" with a value of 1 for all rows. Then use join stage in inner join mode. For more than two files repeat the process with more such key columns and cascading inner joins.