Hi Experts,
My job has a source file, a Teradata connector, a Join stage, and a dataset file.
Here I do a left outer join of the source records with the records from the Teradata connector. The source file has 400 records, and the Teradata connector returns 50000 records.
My issue is that when I execute my parallel job through a Job Activity, I get different results, but when I run the parallel job directly, I get correct results. The difference shows up in the left join. Can anyone please help me with this?
Unable to get proper results using left outer join
Welcome.
The only real difference between running the job 'manually' and via a Job Activity stage in a Sequence job is that in the latter you have the opportunity to mess up the handling of the Job Parameters. Not sure how that would cause left-outer-join issues but honestly, that's really the only difference. Have you triple-checked what is being passed from the Sequence?
The first (or one of the first) log entries shows the job parameters in force and also any environment variables in play for the run of the job.
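One way to do that comparison, once you have copied the parameter and environment variable values out of the two log entries, is a simple diff. This is only a minimal sketch: the parameter name `SRC_FILE` and all the values below are made up for illustration, and you would type in whatever your own logs show.

```python
def diff_params(manual, sequence):
    """Return {name: (manual_value, sequence_value)} for every entry that differs."""
    names = sorted(set(manual) | set(sequence))
    return {
        name: (manual.get(name), sequence.get(name))
        for name in names
        if manual.get(name) != sequence.get(name)
    }

# Hypothetical values copied by hand from the first log entry of each run.
manual_run = {
    "$APT_CONFIG_FILE": "/path/to/one_node.apt",
    "SRC_FILE": "/data/in/source.txt",
}
sequence_run = {
    "$APT_CONFIG_FILE": "/path/to/four_node.apt",
    "SRC_FILE": "/data/in/source.txt",
}

differences = diff_params(manual_run, sequence_run)
for name, (manual_value, sequence_value) in differences.items():
    print(f"{name}: manual={manual_value!r} sequence={sequence_value!r}")
```

Anything this prints is a setting that the Sequence passes differently from the manual run, and so a candidate cause.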
-craig
"You can never have too many knives" -- Logan Nine Fingers
Re: Unable to get proper results using left outer join
There won't be any difference based on how you run the job.
1) Consider the link ordering.
2) Use proper partitioning, or Auto if you are not comfortable with partitioning.
3) In the link counts, the record count after the join should match the record count on the left (source) link.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
Welcome Knip1 to our little community.
Rules I always follow for a join stage:
1) Hash Partition on first element of join key
2) Sort on all join keys, either on the input link or in a Sort stage beforehand
3) Make sure your link ordering is correct
Rules 1 and 2 are particularly important: they ensure that rows with the same key land in the same partition and that the sort order is correct.
This may not be the most efficient method, but it works 100% of the time for me. Partitioning is the major cause of strangeness. Could your standalone run be using one node and your sequence be using more? ($APT_CONFIG_FILE).
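To illustrate why partitioning matters, here is a rough Python simulation, not DataStage code. It assumes a unique join key named `id` and a lookup column named `val` (both hypothetical), reuses the 400 and 50000 row counts from the question, and uses a dictionary lookup in place of the sorted merge a real Join stage performs.

```python
from zlib import crc32

def hash_partition(rows, key, nodes):
    """Place each row in the partition chosen by a hash of its key value."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[crc32(str(row[key]).encode()) % nodes].append(row)
    return parts

def round_robin_partition(rows, nodes):
    """Deal rows out in turn, ignoring the key (stands in for a mismatched partitioner)."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def left_outer_join(left_parts, right_parts, key):
    """Join partition by partition: a left row only sees right rows in its own partition."""
    out = []
    for left, right in zip(left_parts, right_parts):
        lookup = {r[key]: r for r in right}
        for row in left:
            match = lookup.get(row[key])
            out.append({**row, "val": match["val"] if match else None})
    return out

def unmatched(rows):
    """Count output rows whose right-hand column came back empty."""
    return sum(1 for r in rows if r["val"] is None)

left = [{"id": i} for i in range(400)]                      # source file
right = [{"id": i, "val": f"v{i}"} for i in range(50000)]   # reference data

nodes = 4
one_node = left_outer_join(hash_partition(left, "id", 1),
                           hash_partition(right, "id", 1), "id")
good = left_outer_join(hash_partition(left, "id", nodes),
                       hash_partition(right, "id", nodes), "id")
bad = left_outer_join(round_robin_partition(left, nodes),
                      hash_partition(right, "id", nodes), "id")

for name, rows in (("one node", one_node), ("hash/hash", good), ("mismatched", bad)):
    print(f"{name}: rows={len(rows)} unmatched={unmatched(rows)}")
```

Note that in this sketch the output row count is 400 in every case, because a left outer join keeps every left row. What changes with mismatched partitioning on more than one node is how many rows lose their match, so comparing link counts alone may not reveal the problem.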
Good luck
Andrew
Think outside the Datastage you work in.
There is no True Way, but there are true ways.