Unable to get proper results using left outer join

knip1
Participant
Posts: 5
Joined: Tue Nov 30, 2010 11:57 am

Post by knip1 »

Hi Experts,
My job has a source file, a Teradata connector, a Join stage, and a dataset file.
I do a left outer join of the source records with the records from the Teradata connector. The source file has 400 records and the Teradata connector returns 50000 records.
My issue is that when I execute my parallel job using a Job Activity, I get different results, but when I run the parallel job directly, I get correct results. The difference shows up in the left join output. Can anyone please help me with this?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Welcome.

The only real difference between running the job 'manually' and via a Job Activity stage in a Sequence job is that in the latter you have the opportunity to mess up the handling of the Job Parameters. Not sure how that would cause left-outer-join issues but honestly, that's really the only difference. Have you triple-checked what is being passed from the Sequence?

The first (or one of the first) log entries shows the job parameters in force and also any environment variables in play for the run of the job.
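
One way to do that comparison is to copy the parameter lines from the log of each run and diff them. A minimal sketch, assuming the lines have been pasted in as plain "name = value" text; the parameter names and paths below are hypothetical, not taken from the job in question:

```python
def parse_params(text):
    """Turn pasted 'name = value' lines into a dict (assumed format)."""
    params = {}
    for line in text.splitlines():
        if "=" in line:
            name, value = line.split("=", 1)
            params[name.strip()] = value.strip()
    return params


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for every name that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Hypothetical values for illustration only.
manual_run = """SRC_FILE = /data/in/source.txt
$APT_CONFIG_FILE = /opt/config/1node.apt"""
sequence_run = """SRC_FILE = /data/in/source.txt
$APT_CONFIG_FILE = /opt/config/4node.apt"""

differences = diff_params(parse_params(manual_run), parse_params(sequence_run))
for name, (manual, sequence) in differences.items():
    print(f"{name}: manual={manual} sequence={sequence}")
```

Anything this reports is a value the Sequence passes differently from the manual run.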
-craig

"You can never have too many knives" -- Logan Nine Fingers
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: Unable to get proper results using left outer join

Post by SURA »

There won't be any difference based on how you run the job.

1) Check the link ordering.
2) Use proper partitioning, or Auto if you are not comfortable with partitioning.
3) Check the link counts: you must have the same number of records after the join.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
aartlett
Charter Member
Posts: 152
Joined: Fri Apr 23, 2004 6:44 pm
Location: Australia

Post by aartlett »

Welcome Knip1 to our little community.

Rules I always follow for a join stage:
1) Hash partition on the first element of the join key
2) On the input link (or in a Sort stage beforehand), sort on all join keys
3) Make sure your link ordering is correct

1 and 2 are particularly important: they make sure that the same key data is available in the same partitions and that the sort order is correct.

This may not be the most efficient method, but it works 100% of the time for me. Partitioning is the major cause of strangeness. Could your standalone run be using one node and your sequence run be using more? (Check $APT_CONFIG_FILE.)
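
To see why partitioning matters, here is a small simulation in plain Python (not DataStage code). It is only a sketch under assumptions: the row layout, the two-node setup, and the function names are made up for illustration, and the join uses an in-memory lookup, so the sorting requirement of the real Join stage is not modelled. It compares a single-node join, a join where both inputs are hash partitioned on the key, and a join where the right input is partitioned round robin instead.

```python
from collections import defaultdict


def hash_partition(rows, key, nodes):
    """Send each row to partition hash(key) % nodes, so equal keys share a partition."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[hash(row[key]) % nodes].append(row)
    return parts


def round_robin_partition(rows, nodes):
    """Deal rows out in turn, ignoring the key entirely."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def left_outer_join(left, right, key):
    """Keep every left row; fill 'desc' from the right side, or None if no match."""
    lookup = defaultdict(list)
    for r in right:
        lookup[r[key]].append(r)
    out = []
    for l in left:
        matches = lookup.get(l[key])
        if matches:
            out.extend({**l, "desc": m["desc"]} for m in matches)
        else:
            out.append({**l, "desc": None})
    return out


def partitioned_join(left_parts, right_parts, key):
    """Each node only sees its own pair of partitions, as in a parallel job."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(left_outer_join(lp, rp, key))
    return out


def unmatched(rows):
    """Count left rows that found no partner on the right."""
    return sum(1 for r in rows if r["desc"] is None)


# Made-up data: every left key has exactly one partner on the right.
left = [{"id": i, "amt": i * 10} for i in range(1, 9)]
right = [{"id": i, "desc": f"row{i}"} for i in range(1, 9)]
NODES = 2

single = left_outer_join(left, right, "id")
aligned = partitioned_join(hash_partition(left, "id", NODES),
                           hash_partition(right, "id", NODES), "id")
misaligned = partitioned_join(hash_partition(left, "id", NODES),
                              round_robin_partition(right, NODES), "id")

print(unmatched(single), unmatched(aligned), unmatched(misaligned))
```

With this data the single-node and hash/hash runs match every row, while the run with mismatched partitioning leaves all eight left rows unmatched, even though the row count on the output link is the same in all three cases. That is the kind of difference a one-node versus multi-node configuration can produce.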

Good luck
Andrew

Think outside the Datastage you work in.

There is no True Way, but there are true ways.