Unable to get proper results using left outer join

Posted: Sun Jul 01, 2012 9:25 am
by knip1
Hi Experts,
My job has a source file, a Teradata connector, a Join stage, and a dataset file.
I do a left outer join of the source records with the records from the Teradata connector. The source file has 400 records and the Teradata connector returns 50000 records.
My issue is that when I execute my parallel job using a Job Activity, I get different results, but when I run the parallel job directly, I get correct results. The difference shows up in the left join. Can anyone please help me with this?

Posted: Sun Jul 01, 2012 1:57 pm
by chulett
Welcome.

The only real difference between running the job 'manually' and via a Job Activity stage in a Sequence job is that in the latter you have the opportunity to mess up the handling of the Job Parameters. Not sure how that would cause left-outer-join issues but honestly, that's really the only difference. Have you triple-checked what is being passed from the Sequence?

The first (or one of the first) log entries shows the job parameters in force and also any environment variables in play for the run of the job.
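
One way to act on this is to copy the parameter values from the first log entries of both runs and compare them side by side. The following is only a minimal sketch of that comparison in Python: the parameter name SRC_FILE and all the values are hypothetical, and the dictionaries are assumed to be typed in by hand from the two job logs.

```python
def diff_params(direct_run, sequence_run):
    """Return {name: (direct_value, sequence_value)} for each parameter that differs."""
    names = sorted(set(direct_run) | set(sequence_run))
    return {
        name: (direct_run.get(name), sequence_run.get(name))
        for name in names
        if direct_run.get(name) != sequence_run.get(name)
    }

# Hypothetical values copied by hand from the two job logs.
direct = {"$APT_CONFIG_FILE": "/path/to/one_node.apt", "SRC_FILE": "input.txt"}
via_sequence = {"$APT_CONFIG_FILE": "/path/to/four_node.apt", "SRC_FILE": "input.txt"}

differences = diff_params(direct, via_sequence)
for name, (direct_value, sequence_value) in differences.items():
    print(f"{name}: direct={direct_value!r} sequence={sequence_value!r}")
```

A parameter that is missing from one run shows up with None on that side, so parameters the Sequence never passed are also caught.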

Re: Unable to get proper results using left outer join

Posted: Sun Jul 01, 2012 5:24 pm
by SURA
There won't be any difference based on how you are running the job.


1) Consider the link ordering.
2) Use proper partitioning, or Auto if you are not comfortable with partitioning.
3) Check the link counts: you should have the same number of records after the join.
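
As a rough illustration of the link count check, the sketch below computes how many rows a left outer join should emit for given key lists. It is plain Python, not DataStage code, and the function name is made up for this example. It assumes the usual left outer join semantics: every left row comes out once per matching right row, or once with nulls if nothing matches.

```python
from collections import Counter

def expected_left_outer_count(left_keys, right_keys):
    """Number of rows a left outer join should produce for these key lists."""
    right_counts = Counter(right_keys)
    # A left row with no match still yields one output row.
    return sum(max(right_counts[key], 1) for key in left_keys)

# With unique keys on the right side, the output count equals the left count.
count_unique = expected_left_outer_count(range(400), range(50000))

# Duplicate keys on the right side multiply the matching left rows.
count_dupes = expected_left_outer_count([1, 2, 3], [2, 2, 3])
```

If the reference keys are unique, the output link should therefore show the same count as the left input link; a higher count points to duplicate keys on the reference side.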

Posted: Sun Jul 01, 2012 8:04 pm
by aartlett
Welcome Knip1 to our little community.

Rules I always follow for a join stage:
1) Hash partition on the first element of the join key
2) On the input link (or in a Sort stage beforehand), sort on all join keys
3) Make sure your link ordering is correct

1 and 2 are particularly important: they make sure that rows with the same key end up in the same partition and that the sort order is correct.

This may not be the most efficient method, but it works 100% of the time for me. Partitioning is the major cause of strangeness. Could your standalone run be using one node and your sequence run be using more (check $APT_CONFIG_FILE)?
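
To show why the node count can change a join result, here is a small simulation in plain Python. It is not DataStage code. It assumes each node joins only the rows in its own partition, and it uses a dictionary lookup inside each partition instead of the sort-merge a real Join stage performs, so it does not model the sorting requirement. The data and the "id" and "val" field names are made up for the example.

```python
from collections import defaultdict

def hash_partition(rows, key, nodes):
    """Send every row with the same key value to the same partition."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[row[key] % nodes].append(row)
    return parts

def round_robin_partition(rows, nodes):
    """Deal rows out in turn, ignoring the key."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def left_outer_join(left, right, key):
    """Left outer join of two lists of dicts, as seen by a single node."""
    lookup = defaultdict(list)
    for r in right:
        lookup[r[key]].append(r)
    out = []
    for l in left:
        matches = lookup.get(l[key])
        if matches:
            out.extend({**l, "val": r["val"]} for r in matches)
        else:
            out.append({**l, "val": None})  # unmatched left row is kept
    return out

def partitioned_join(left_parts, right_parts, key):
    """Join partition i of the left input with partition i of the right input."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(left_outer_join(lp, rp, key))
    return sorted(out, key=lambda row: row[key])

left = [{"id": k} for k in range(1, 9)]
right = [{"id": k, "val": k * 10} for k in (8, 7, 6, 5, 4, 3, 2)]

correct = left_outer_join(left, right, "id")
one_node = partitioned_join(round_robin_partition(left, 1),
                            round_robin_partition(right, 1), "id")
two_nodes_hash = partitioned_join(hash_partition(left, "id", 2),
                                  hash_partition(right, "id", 2), "id")
two_nodes_rr = partitioned_join(round_robin_partition(left, 2),
                                round_robin_partition(right, 2), "id")

def matched(rows):
    """Count output rows that found a partner on the right side."""
    return sum(row["val"] is not None for row in rows)

print(matched(correct), matched(one_node), matched(two_nodes_hash), matched(two_nodes_rr))
```

In this sketch the one-node run and the hash-partitioned two-node run both agree with the reference join, while the two-node run without key-based partitioning still returns every left row but loses matches, because matching keys sit on different nodes.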

Good luck