Help needed for Join Stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

jack_dcy
Participant
Posts: 18
Joined: Wed Jun 29, 2005 9:53 pm

Help needed for Join Stage

Post by jack_dcy »

Hi,
We use a Join stage with an inner join. The left link comes from a sequential file; the right link comes from a sequential file followed by a Sort stage and a Remove Duplicates stage. The job should output some records, but when we set two nodes in the configuration file, nothing is output. When we set only one node in the configuration file, we get the correct result. Alternatively, if we keep two nodes and rewrite the Sort and Remove Duplicates stages, the result is also OK.

Why?
benny.lbs
Participant
Posts: 125
Joined: Wed Feb 23, 2005 3:46 am

Re: Help needed for Join Stage

Post by benny.lbs »

Try hash-partitioning the inputs before the join.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

As we know, a sequential file won't work in parallel. Could the problem be related to that in some way? :roll:
ranga1970
Participant
Posts: 141
Joined: Thu Nov 04, 2004 3:29 pm
Location: Hyderabad

Post by ranga1970 »

Kumar is right, I believe: a sequential file cannot be used in parallel.

thanks
RRCHINTALA
sandy
Participant
Posts: 24
Joined: Sun Feb 01, 2004 1:14 am

Post by sandy »

Sequential files can be used in parallel jobs having the join operation. Try partitioning both the inputs to the join stage from within the join stage properties. Partition the data based on the key columns of the inner join.
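
To illustrate why partitioning both inputs on the join key matters, here is a minimal Python sketch. It is not DataStage code: the function names, the sample rows, and the use of crc32 as the hash are all assumptions made for illustration. It compares a single-node join with a two-partition join where both inputs are hash-partitioned on the key, and with a contrived mismatch where one input is spread round-robin and the other sits entirely on one partition.

```python
from zlib import crc32


def hash_partition(rows, key, nodes):
    """Place each row in partition crc32(key) % nodes, so equal keys always meet."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[crc32(str(row[key]).encode()) % nodes].append(row)
    return parts


def round_robin_partition(rows, nodes):
    """Deal rows out in turn, ignoring the key entirely."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def inner_join(left, right, key):
    """Plain inner join of two row lists on one key column."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def partitioned_join(left_parts, right_parts, key):
    """Each node joins only the rows that landed in its own partition."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(inner_join(lp, rp, key))
    return out


# Hypothetical sample data: ids 2, 4 and 6 exist on both sides.
left = [{"id": i, "name": f"n{i}"} for i in range(1, 7)]
right = [{"id": i, "amount": i * 10} for i in (2, 4, 6)]

# One node: every left row can see every right row.
expected = inner_join(left, right, "id")

# Two nodes, both inputs hash-partitioned on the join key.
hashed = partitioned_join(
    hash_partition(left, "id", 2), hash_partition(right, "id", 2), "id"
)

# Two nodes, mismatched: left is round-robin, right is all on partition 0.
mismatched = partitioned_join(
    round_robin_partition(left, 2), [right, []], "id"
)

print(len(expected), len(hashed), len(mismatched))
```

With this sample data the hash-partitioned join finds the same three ids as the single-node join, while the mismatched layout finds none, because matching keys never end up on the same partition.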

IHTH.
jack_dcy
Participant
Posts: 18
Joined: Wed Jun 29, 2005 9:53 pm

Post by jack_dcy »

benny.lbs,

Thanks, it works when I set the Hash partition type on the inputs before the join.
Can't the Join stage pick a suitable partition type with Auto?

ranga1970, benny.lbs,

If the source data for both of these links is a sequential file, what can we do?
benny.lbs
Participant
Posts: 125
Joined: Wed Feb 23, 2005 3:46 am

Post by benny.lbs »

Try setting the option "Number Of Readers Per Node" to 2 or more.

The Sequential File stage will then process much faster, but the right number depends on your server's capability.
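
As a rough picture of how several readers can share one file, here is a small Python sketch. It is only an illustration of the general idea and is not how DataStage implements this option: the helper names, the byte-range split, and the sample file are all assumptions. Each reader takes one byte range and reads only the lines that begin inside it, so no line is read twice or skipped.

```python
import os
import tempfile


def split_ranges(size, readers):
    """Divide [0, size) into one contiguous byte range per reader."""
    step = -(-size // readers)  # ceiling division
    return [
        (min(i * step, size), min((i + 1) * step, size)) for i in range(readers)
    ]


def read_range(path, start, end):
    """Return the lines whose first byte lies in [start, end)."""
    lines = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and consume up to the next newline, so a
            # reader that lands mid-line leaves that line to the previous reader.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            lines.append(line.rstrip(b"\n").decode())
    return lines


# Hypothetical ten-row sample file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"{i},row{i}\n" for i in range(1, 11)).encode())
    path = tmp.name

# Two "readers", run one after the other here for simplicity.
chunks = [
    read_range(path, s, e) for s, e in split_ranges(os.path.getsize(path), 2)
]
os.remove(path)

all_lines = [line for chunk in chunks for line in chunk]
print(len(chunks), len(all_lines))
```

In a real job the readers would run concurrently; the sketch runs them in sequence only to show that the ranges together cover every line exactly once.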
jack_dcy
Participant
Posts: 18
Joined: Wed Jun 29, 2005 9:53 pm

Post by jack_dcy »

Can anyone explain why simply relinking and rewriting the Sort and Remove Duplicates stages also solves this problem? Is it a bug in DSEE?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I am not discounting the possibility that you found a bug. However, somebody may have experimented with different partitioning settings in the Sort and Remove Duplicates stages and caused the problem, or the stages might have been copied in from another job with incorrect settings, and the unlinking and relinking reverted the settings back to Auto.