DSXchange

Posted: **Tue May 11, 2010 10:59 pm**

Hi,

How can I perform a self join using the Join stage in a parallel job?
Is it possible to perform a self join at all?

Thanks

Posted: **Wed May 12, 2010 12:00 am**

Hi,

I think you need to use the same file as both the left and the right input. Then you can achieve the self join in DataStage.

Posted: **Wed May 12, 2010 12:43 am**

Magesh_bala wrote:Hi,

I think you need to use the same file as both the left and the right input. Then you can achieve the self join in DataStage.

Is it possible to do this with a single input file?

Posted: **Wed May 12, 2010 12:46 am**

Yes, both the Join stage inputs read data from the same file.

Posted: **Wed May 12, 2010 1:00 am**

ray.wurlod wrote:Yes, both the Join stage inputs read data from the same file. ...

I want to use only one input stage, not two identical input stages, for example.

Posted: **Wed May 12, 2010 3:08 am**

You can always include something like a Copy stage and split the stream into two.

Posted: **Wed May 12, 2010 10:26 pm**

ksk wrote: I want to use only one input stage, not two identical input stages, for example.

But a join requires two input streams.

Posted: **Wed May 12, 2010 11:11 pm**

Yes. And your point is?


                        +----------------+
                        |                V
      SeqFile  ---->  Copy              Join  -------> 
                        |                ^
                        +----------------+
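
For anyone who finds the picture too terse, here is the same data flow as a rough Python sketch. It is only an illustration, not DataStage code: the function `self_join`, the employee/manager sample rows, and the `mgr_` prefix are all made up for the example. Renaming the columns on one side is my own assumption about what a self join usually needs so that the two copies of each column do not collide.

```python
from collections import defaultdict

def self_join(rows, left_key, right_key, right_prefix="r_"):
    """Inner-join one list of row dicts to itself (hypothetical helper)."""
    # "Copy" step: a single physical input feeds two logical streams.
    left_stream = list(rows)
    # Rename every column on the right stream so names do not collide.
    right_stream = [
        {right_prefix + col: val for col, val in row.items()} for row in rows
    ]

    # Index the right stream by its (renamed) join key.
    index = defaultdict(list)
    for row in right_stream:
        index[row[right_prefix + right_key]].append(row)

    # "Join" step: pair each left row with every matching right row.
    joined = []
    for row in left_stream:
        for match in index.get(row[left_key], []):
            joined.append({**row, **match})
    return joined

# Made-up sample data: each employee points at a manager in the same file.
employees = [
    {"emp_id": 1, "name": "Ann", "mgr_id": None},
    {"emp_id": 2, "name": "Bob", "mgr_id": 1},
    {"emp_id": 3, "name": "Cy", "mgr_id": 1},
]

pairs = self_join(employees, left_key="mgr_id", right_key="emp_id",
                  right_prefix="mgr_")
for p in pairs:
    print(p["name"], "reports to", p["mgr_name"])
```

The source is read once; only the stream is duplicated before the join, which is the point of the Copy in the diagram.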

Posted: **Thu May 13, 2010 11:24 pm**

I understand the OP to mean he only wants one input into a Join stage, which requires two. So to do a self-join, I'm with you...two inputs, but both are the one physical input.

Posted: **Fri May 14, 2010 6:30 am**

No, they stated "one input stage" not one input into the Join, which (as noted) is not possible. And that solution has been posted. More than once.

Posted: **Fri May 14, 2010 6:45 am**

You're probably better off landing your data in a Data Set and, depending on volumes, either using a Lookup or joining the data in a second job. Of course, if you insist on using only one input stream, you could copy it into two sorts on the join keys and merge the data back together. This will give rubbish results if your join key is not unique.
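
One way to read the warning about non-unique keys, sketched in Python under my own assumptions (the helper `self_join_row_count` is hypothetical and is not anything DataStage provides): when a stream is inner-joined to itself on the same key, a key value that occurs n times pairs with itself n × n times, so duplicates multiply the output.

```python
from collections import Counter

def self_join_row_count(keys):
    """Rows produced by an inner self join on the given key values."""
    counts = Counter(keys)
    # Each key occurring n times matches itself n * n times.
    return sum(n * n for n in counts.values())

unique_keys = [1, 2, 3]
duplicate_keys = [1, 1, 2]

print(self_join_row_count(unique_keys))
print(self_join_row_count(duplicate_keys))
```

With unique keys the output has as many rows as the input; with duplicates it grows, so it is worth checking key uniqueness before relying on this design.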


how to perform self join
