
Help needed for Join Stage

Posted: Wed Jul 20, 2005 9:16 pm
by jack_dcy
Hi,
We use a Join stage with an inner join. The left link is a sequential file, and the right link is a sequential file followed by a Sort stage and a Remove Duplicates stage. The job should output some records, but with two nodes in the configuration file it outputs nothing, while with only one node it gives the correct result. If we keep two nodes and re-create the Sort and Remove Duplicates stages, the result is also OK.

Why?

Re: Help needed for Join Stage

Posted: Wed Jul 20, 2005 9:36 pm
by benny.lbs
Try hash partitioning the inputs before the join.

Posted: Wed Jul 20, 2005 11:32 pm
by kumar_s
As we know, a sequential file won't work in parallel. Could the problem be related to that?

Posted: Wed Jul 20, 2005 11:35 pm
by ranga1970
Kumar is right, I believe: a sequential file cannot be used in parallel.

thanks

Posted: Thu Jul 21, 2005 12:05 am
by sandy
Sequential files can be used in parallel jobs that contain a join operation. Try partitioning both inputs to the Join stage from within the Join stage properties, based on the key columns of the inner join.

IHTH.
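
To illustrate why partitioning on the join key matters, here is a minimal Python simulation. It is not DataStage code, and it only assumes (the thread does not confirm this) that the rows were spread across the two nodes without regard to the key, modelled here as a hypothetical round-robin deal. The function names and sample rows are made up for the sketch.

```python
# Illustrative simulation only; this is not DataStage code.
# Each "node" joins only the rows that were sent to it.

def round_robin(rows, nodes):
    """Deal rows to nodes in turn, ignoring the join key (assumed behaviour)."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def hash_partition(rows, nodes):
    """Send every row with the same key (first field) to the same node."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[hash(row[0]) % nodes].append(row)
    return parts

def inner_join_per_node(left_parts, right_parts):
    """Inner-join each node's left rows with that node's right rows only."""
    out = []
    for left, right in zip(left_parts, right_parts):
        lookup = {}
        for key, value in right:
            lookup.setdefault(key, []).append(value)
        for key, value in left:
            for rvalue in lookup.get(key, []):
                out.append((key, value, rvalue))
    return sorted(out)

# Hypothetical sample data: (key, value) pairs on each link.
left = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
right = [(2, "x"), (1, "w"), (4, "z"), (3, "y")]

# Two nodes, key-agnostic distribution: matching keys land on different nodes.
print(inner_join_per_node(round_robin(left, 2), round_robin(right, 2)))

# Two nodes, hash partitioned on the key: matching keys meet on the same node.
print(inner_join_per_node(hash_partition(left, 2), hash_partition(right, 2)))

# One node: everything is on the same node, so the join sees all rows.
print(inner_join_per_node(round_robin(left, 1), round_robin(right, 1)))
```

With this sample data the key-agnostic two-node run returns no rows, while the hash-partitioned run and the single-node run both return all four matches, which mirrors the symptoms described in the first post.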

Posted: Thu Jul 21, 2005 12:06 am
by jack_dcy
benny.lbs,

Thanks, it works when I set the Hash partition type on the inputs before the join.
Can't the Join stage choose a suitable partition type with Auto?

ranga1970, benny.lbs,

If the source data for both links is sequential files, what can we do?

Posted: Thu Jul 21, 2005 12:44 am
by benny.lbs
Try setting the option "Number Of Readers Per Node" to 2 or more.

The Sequential File stage will process much faster, but the right number depends on your server's capability.

Posted: Thu Jul 21, 2005 1:42 am
by jack_dcy
Can anyone explain why simply relinking and re-creating the Sort and Remove Duplicates stages also solves the problem? Is it a bug in DSEE?

Posted: Thu Jul 21, 2005 5:27 pm
by vmcburney
I am not discounting the possibility that you found a bug. However, somebody may have experimented with different partitioning settings in the Remove Duplicates and Sort stages and caused the problem, or the stages might have been copied in from another job with incorrect settings, and the unlinking and relinking reverted the settings back to Auto.