Diff results in each run

Posted: Thu Jan 19, 2012 6:38 pm
by gagan8877
I wrote a job in which a dataset is read and matched against reference output coming from an Oracle table. The match occurs via 10 parallel lookups on 10 different columns. The reference table always produces exactly the same number of rows, and the dataset always supplies exactly the same number of input rows. But the Lookup stages produce a different number of rows on each job run. The parameter values are exactly the same in each run (no change in letter case for strings either). I changed all the lookups to execute sequentially, but the problem remains. Any ideas?

More Info about the job:
https://picasaweb.google.com/wizlogic/S ... _DiffRslts

(moderator: removed impressively long but singularly unhelpful generated OSH)

Posted: Thu Jan 19, 2012 9:22 pm
by ray.wurlod
In spite of its impressive length, the generated OSH is singularly unhelpful in answering this question. More useful would be the score. In the meantime you could advise us how the data are partitioned, particularly on the inputs to the Lookup stage, and whether anything (such as the contents, as opposed to the number of rows, of the Oracle table) changes between runs.

Posted: Fri Jan 20, 2012 8:14 am
by qt_ky
It sounds like your partitioning may not be correct. Lookup reference links on SMP should be set to entire or auto. What do you have them set to?

Why are your screen shots not showing any link markers (little icons) for partitioning or collecting?

The only thing I see in the osh is a lot of -pp flags. Are you clearing the "Preserve partitioning" setting a lot?

Try running the entire job on a single node more than once, rather than tinkering with a few stages. Tell us if you get the same behavior as before.

Posted: Fri Jan 20, 2012 8:29 am
by chulett
qt_ky wrote: It sounds like your partitioning may not be correct.
Agreed. Anytime I hear "different results" running the same data through multiple times I suspect faulty partitioning. And I too would be curious what happens when the job runs on a single node.
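
To illustrate why partitioning is the usual suspect, here is a small Python simulation (not DataStage code). It assumes a 4-node layout, 100 hypothetical keys, and rows that arrive in a different order on each run. The function names are made up for this sketch. When the reference rows are spread round-robin, an input row only finds its match if the reference row happens to land on the same node, so the match count can drift from run to run. When every node gets the entire reference set, the count stays fixed.

```python
import random

def round_robin(rows, nodes):
    """Deal rows out to nodes in arrival order."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def entire(rows, nodes):
    """Give every node a full copy of the rows."""
    return [list(rows) for _ in range(nodes)]

def lookup_matches(input_parts, ref_parts):
    """Each node looks up its input rows only in its own reference partition."""
    matched = 0
    for inp, ref in zip(input_parts, ref_parts):
        ref_keys = set(ref)
        matched += sum(1 for key in inp if key in ref_keys)
    return matched

def run(ref_partitioner, seed, nodes=4):
    """One simulated job run; the seed stands in for run-to-run arrival order."""
    rng = random.Random(seed)
    source = list(range(100))      # hypothetical input keys
    reference = list(range(100))   # hypothetical reference keys
    rng.shuffle(source)
    rng.shuffle(reference)
    return lookup_matches(round_robin(source, nodes),
                          ref_partitioner(reference, nodes))

entire_counts = [run(entire, seed) for seed in range(20)]
round_robin_counts = [run(round_robin, seed) for seed in range(20)]
print("entire:     ", sorted(set(entire_counts)))
print("round robin:", sorted(set(round_robin_counts)))
```

This is also why a single-node run is a useful test: with one node, every row shares the same partition, so partitioning effects disappear.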

Posted: Fri Jan 20, 2012 10:40 am
by josejohny
In the screenshot, the RMVDUP_UP_HIER stage (a Remove Duplicates stage) has different output. So check the partitioning on the Remove Duplicates stage.

Found the culprit

Posted: Sat Jan 21, 2012 4:05 pm
by gagan8877
Hi All

Thanks for all your replies. I tried a single-node config file, but the problem was not resolved. I noticed that the problem only occurs if the job is run in the sequence; if I run it individually, it produces the same results each time. This led me to focus on the job running upstream, which had a logic flaw: a Remove Duplicates stage was producing a different dataset with each run. This dataset was the source for the next job. So marking it resolved.

Gary
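
The exact flaw in the upstream job is not described above, but one common way a remove-duplicates step can yield a different dataset on each run is keeping the "first" row per key when the arrival order within a key is not fixed. The Python sketch below illustrates that assumption only; the column names `id` and `parent` and both function names are hypothetical.

```python
def remove_duplicates(rows, key):
    """Keep the first row seen for each key value (order dependent)."""
    kept = {}
    for row in rows:
        kept.setdefault(row[key], row)
    return list(kept.values())

def remove_duplicates_sorted(rows, key, tiebreak):
    """Sort on key plus a tiebreak column first, so 'first' is well defined."""
    ordered = sorted(rows, key=lambda r: (r[key], r[tiebreak]))
    return remove_duplicates(ordered, key)

# Hypothetical data: id 1 appears twice with different non-key values.
rows = [
    {"id": 1, "parent": "A"},
    {"id": 1, "parent": "B"},
    {"id": 2, "parent": "C"},
]

# Same rows, different arrival order, as might happen between runs.
run1 = remove_duplicates(rows, "id")
run2 = remove_duplicates(list(reversed(rows)), "id")
print(run1)  # id 1 survives with parent "A"
print(run2)  # id 1 survives with parent "B"

# With an explicit sort on a tiebreak column, both orders agree.
stable1 = remove_duplicates_sorted(rows, "id", "parent")
stable2 = remove_duplicates_sorted(list(reversed(rows)), "id", "parent")
print(stable1 == stable2)
```

Both runs return the same number of rows, yet the surviving non-key values differ, which is enough to change how many rows match in a lookup further downstream.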