Help needed for Join Stage

jack_dcy · Post by **jack_dcy** » Wed Jul 20, 2005 9:16 pm

Hi,
We use a Join stage with an inner join. The left link comes from a sequential file; the right link comes from a sequential file followed by a Sort stage and a Remove Duplicates stage. The job should output some records, but when we set two nodes in the configuration file there is no output at all. When we set only one node in the configuration file, we get the correct result. If we keep two nodes and rewrite the Sort and Remove Duplicates stages, the result is also OK.

Why?

benny.lbs · Post by **benny.lbs** » Wed Jul 20, 2005 9:36 pm

Try hash partitioning the inputs before the join.


kumar_s · Post by **kumar_s** » Wed Jul 20, 2005 11:32 pm

As we know, a sequential file won't work under parallelism. Could the problem be related to that in some way?

ranga1970 · Post by **ranga1970** » Wed Jul 20, 2005 11:35 pm

Kumar is right, I believe; a sequential file cannot be used in parallelism.

Thanks

sandy · Post by **sandy** » Thu Jul 21, 2005 12:05 am

Sequential files can be used in parallel jobs that contain a join. Try partitioning both inputs to the Join stage from within the Join stage properties, using the key columns of the inner join as the partitioning keys.
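To see why partitioning on the join key matters, here is a minimal Python sketch. It is not DataStage code: the function names and sample rows are made up for illustration, and round robin stands in for any partitioning that ignores the key (it is an assumption, not a statement of what Auto actually chose in this job). Each node joins only the rows it was given, so matching rows that land on different nodes never meet.

```python
from collections import defaultdict

def round_robin(rows, nodes):
    """Deal rows to nodes in turn, ignoring the join key."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def hash_partition(rows, nodes, key):
    """Send every row with the same key value to the same node."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[hash(row[key]) % nodes].append(row)
    return parts

def inner_join(left, right, key):
    """Plain inner join of two row lists on one key column."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def parallel_join(left_parts, right_parts, key):
    """Each node joins only its own pair of partitions."""
    out = []
    for l, r in zip(left_parts, right_parts):
        out.extend(inner_join(l, r, key))
    return out

# Hypothetical sample data; integer keys keep hash() deterministic.
left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"},
        {"id": 3, "name": "c"}, {"id": 4, "name": "d"}]
right = [{"id": 2, "amt": 20}, {"id": 1, "amt": 10},
         {"id": 4, "amt": 40}, {"id": 3, "amt": 30}]

one_node = parallel_join(round_robin(left, 1), round_robin(right, 1), "id")
two_nodes_rr = parallel_join(round_robin(left, 2), round_robin(right, 2), "id")
two_nodes_hash = parallel_join(hash_partition(left, 2, "id"),
                               hash_partition(right, 2, "id"), "id")

print(len(one_node), len(two_nodes_rr), len(two_nodes_hash))  # 4 0 4
```

With this sample data the key-unaware split on two nodes returns nothing, while one node or hash partitioning on `id` returns all four matches, which is the same pattern as the symptom described in the first post.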

IHTH.

jack_dcy · Post by **jack_dcy** » Thu Jul 21, 2005 12:06 am

benny.lbs

Thanks, it works when I set the Hash partition type on the inputs before the join.
Can't the Join stage choose a suitable partition type when set to Auto?

ranga1970, benny.lbs

If the source data for both of these links is a sequential file, what can we do?

benny.lbs · Post by **benny.lbs** » Thu Jul 21, 2005 12:44 am

Try setting the option "Number Of Readers Per Node" to 2 or more.

The Sequential File stage will process much faster, but the right number depends on your server's capability.
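As a rough idea of what several readers over one file can look like, here is a generic Python sketch. It is not how DataStage implements "Number Of Readers Per Node"; the helper names and the sample file are assumptions for illustration. The file is cut into byte ranges, and each reader takes the lines that start inside its range, so no record is read twice or skipped.

```python
import os
import tempfile

def reader_ranges(size, readers):
    """Cut [0, size) into one contiguous byte range per reader."""
    cuts = [size * i // readers for i in range(readers + 1)]
    return list(zip(cuts[:-1], cuts[1:]))

def read_range(path, start, end):
    """Return the lines whose first byte falls inside [start, end)."""
    lines = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and finish whatever line contains it;
            # that line belongs to the previous reader.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            lines.append(line)
    return lines

# Hypothetical sample file with seven newline-terminated records.
records = [f"{i},row{i}\n".encode() for i in range(1, 8)]
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.writelines(records)
    path = tmp.name

ranges = reader_ranges(os.path.getsize(path), 2)
chunks = [read_range(path, s, e) for s, e in ranges]
os.remove(path)

recombined = [line for chunk in chunks for line in chunk]
print(len(chunks), recombined == records)
```

In a real job the readers would run concurrently; here they run one after another only to show that the ranges cover every record exactly once.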


jack_dcy · Post by **jack_dcy** » Thu Jul 21, 2005 1:42 am

Can anyone explain why simply relinking and rewriting the Sort and Remove Duplicates stages also solves this problem? Is it a bug in DSEE?

vmcburney · Post by **vmcburney** » Thu Jul 21, 2005 5:27 pm

I am not discounting the possibility that you found a bug. However, somebody may have experimented with different partitioning settings in the Remove Duplicates and Sort stages and caused the problem, or the stages might have been copied in from another job with incorrect settings. In either case, unlinking and relinking would have reverted the settings back to Auto.
