Join stage partition



kpsita
Participant
Posts: 99
Joined: Tue Jul 21, 2009 11:43 pm

Post by kpsita »

Hi,

I have a question regarding partitioning in the Join stage. My job design joins two datasets. Should I hash partition the inputs to the Join stage? When we join data from two database stages, the Join stage waits until all the records have been read from the tables, so we get correct results. Is this also the case when joining two datasets?

Thanks
KPSITA
jhmckeever
Premium Member
Posts: 301
Joined: Thu Jul 14, 2005 10:27 am
Location: Melbourne, Australia

Post by jhmckeever »

"... when we join two database stages the join stage will wait till all the records are read from the table ..."
This isn't true, unless you've got a sort somewhere in your job. The join will operate in a 'pipeline' fashion, regardless of whether its source data are provided by a database or a dataset stage.
John McKeever
jim.paradies
Premium Member
Posts: 25
Joined: Thu Jan 31, 2008 11:06 pm
Location: Australia

Post by jim.paradies »

"This isn't true, unless you've got a sort somewhere in your job. The join will operate in a 'pipeline' fashion, regardless of whether its source data are provided by a database or a dataset stage."
Joining data streams that are not pre-sorted on the join key will cause a tsort operator to be inserted on the input links if the Auto partitioning method is used. In fact, the Sort stage is sometimes used in a "Don't sort" mode simply to tell the job the data is already sorted and so avoid re-sorting.
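
To illustrate why the join wants its inputs sorted on the key, here is a minimal Python sketch of a streaming merge join. It is not DataStage code and not how the engine is implemented; `merge_join` and the sample rows are made up for illustration. It only shows that, once both inputs are sorted on the key, rows can be matched as they arrive without holding either input fully in memory.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key=itemgetter(0)):
    """Inner-join two iterables that are ALREADY sorted on the join key.

    Hypothetical helper for illustration only. Rows are tuples whose
    first field is the join key.
    """
    left_groups = groupby(left, key=key)
    right_groups = groupby(right, key=key)
    done = (None, None)  # sentinel returned when an input is exhausted

    lk, lrows = next(left_groups, done)
    rk, rrows = next(right_groups, done)
    while lrows is not None and rrows is not None:
        if lk < rk:
            # Left key has no partner yet: advance the left input only.
            lk, lrows = next(left_groups, done)
        elif lk > rk:
            # Right key has no partner yet: advance the right input only.
            rk, rrows = next(right_groups, done)
        else:
            # Keys match: buffer just this key's right-hand rows.
            right_buf = list(rrows)
            for l in lrows:
                for r in right_buf:
                    yield l + r[1:]
            lk, lrows = next(left_groups, done)
            rk, rrows = next(right_groups, done)


left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right = [(2, "x"), (3, "y"), (4, "z"), (4, "w")]
joined = list(merge_join(left, right))
```

If either input were unsorted, the comparisons above would skip matching rows, which is why a sort has to sit upstream of the join when the data is not already in key order.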

As for whether you need to partition explicitly: if you leave the partitioning method as Auto, it should take care of itself.
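
For anyone wondering why the partitioning method matters for a parallel join at all, here is a rough Python sketch. Again, this is not DataStage code; the function names, the use of `zlib.crc32` as the hash, and the sample data are all assumptions for illustration, and the per-partition join uses a simple lookup rather than a sorted merge. The point is only that each partition joins its own rows, so rows with the same key must land in the same partition on both inputs.

```python
import zlib


def hash_partition(rows, n, key=lambda r: r[0]):
    """Send each row to a partition chosen by a hash of its join key."""
    parts = [[] for _ in range(n)]
    for row in rows:
        idx = zlib.crc32(str(key(row)).encode()) % n
        parts[idx].append(row)
    return parts


def round_robin_partition(rows, n):
    """Deal rows out in turn, ignoring the join key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def local_join(left, right):
    """Inner join within one partition (lookup-based, for brevity)."""
    lookup = {}
    for r in right:
        lookup.setdefault(r[0], []).append(r)
    return [l + r[1:] for l in left for r in lookup.get(l[0], [])]


def partitioned_join(left_parts, right_parts):
    """Each partition only ever sees its own slice of both inputs."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(local_join(lp, rp))
    return out


orders = [(1, "o1"), (2, "o2"), (3, "o3"), (4, "o4")]
customers = [(4, "dan"), (3, "cat"), (2, "bob"), (1, "ann")]

expected = sorted(local_join(orders, customers))
hashed = sorted(partitioned_join(hash_partition(orders, 2),
                                 hash_partition(customers, 2)))
round_robin = sorted(partitioned_join(round_robin_partition(orders, 2),
                                      round_robin_partition(customers, 2)))
```

With hash partitioning on the key, the combined per-partition results match the full join. With round robin on this sample data, matching keys end up in different partitions and the join silently drops rows.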
Jim Paradies