Join stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dsscholar
Premium Member
Posts: 195
Joined: Thu Oct 19, 2006 2:45 pm

Join stage

Post by dsscholar »

Hi Guys,

In the Join stage, I am aware that the parallel engine inserts a tsort operator to sort the data, which in turn requires temp space on the scratch disk for its temporary files. If I prevent the sort by setting the "don't sort" environment variable, that temp space won't be required. How does the key-based join happen in that case? Does it use temp space to do the join, or does it do the join on the fly during database access? If it does use temp space, does the join require less temp space than the sort? Please explain this scenario.

Thanks in advance.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The Join stage will not produce correct results, or may run out of memory, if the data are not sorted. If you prevent insertion of a tsort operator (without providing your own sorting on the input links), your job is likely to abort.
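To illustrate why sorted input matters, here is a minimal Python sketch of a generic sort-merge join. It is not the DataStage implementation; the function name `merge_join` and the tuple layout (key in the first position) are assumptions made for this example. It walks both inputs in step and only ever holds the rows for the current key from each side, which is why it needs both inputs sorted on the key.

```python
from itertools import groupby


def merge_join(left, right, key=lambda row: row[0]):
    """Inner join of two iterables that are assumed to be sorted by key.

    Only the rows of the current key group from each side are held in memory.
    """
    # Materialise each group as it is pulled, because groupby discards a
    # group's rows once the underlying iterator moves on.
    left_groups = ((k, list(g)) for k, g in groupby(left, key))
    right_groups = ((k, list(g)) for k, g in groupby(right, key))

    l = next(left_groups, None)
    r = next(right_groups, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(left_groups, None)      # left key has no match yet
        elif l[0] > r[0]:
            r = next(right_groups, None)     # right key has no match yet
        else:
            # Keys match: emit every pairing of rows within the two groups.
            for lrow in l[1]:
                for rrow in r[1]:
                    yield lrow + rrow[1:]
            l = next(left_groups, None)
            r = next(right_groups, None)


sorted_left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
sorted_right = [(2, "x"), (3, "y"), (4, "z")]
print(list(merge_join(sorted_left, sorted_right)))

# Same algorithm on unsorted input: matching rows are silently missed.
unsorted_left = [(2, "b"), (1, "a"), (2, "c")]
unsorted_right = [(1, "x"), (2, "y")]
print(list(merge_join(unsorted_left, unsorted_right)))
```

With sorted inputs the sketch returns every matching pair. With the unsorted inputs it returns only one of the three expected matches, because once a key has been passed it is never revisited. A key with a very large number of duplicate rows also makes the in-memory group grow, which is one way a merge-style join can exhaust memory.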
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dsscholar
Premium Member
Posts: 195
Joined: Thu Oct 19, 2006 2:45 pm

Post by dsscholar »

Thanks Ray!

Does the Join stage use scratch disk space for the join operation, or does it use the database temp space, or does it use no temp space at all and simply perform the join and return the results on the fly with the database?

Thanks in advance.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

No, no and no.

The Join stage performs its work in memory, which is why it can run out of memory, as Ray mentioned. It doesn't use scratch space, temp space or database space.

Buffers on the input links may use scratch or temp space on your servers.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
vijaykumarpj
Participant
Posts: 30
Joined: Sat Dec 19, 2009 5:19 am
Location: Manchester, UK

Post by vijaykumarpj »

To avoid the automatically inserted tsort operator, you can add an explicit Sort stage before the Join stage.
Thanks,
VJ.