Join Stage Varying Results

Posted: Tue Apr 09, 2013 8:03 am
by nvalia
Hi,

DS 8.7 on Windows

I am doing a left outer join (Join stage) on an integer key field. Both inputs are hash partitioned and sorted on the same key (using the Join stage's built-in sort), and there are no nulls on either link.

But when the same job runs multiple times, I get a varying number of duplicate records from this join. (Duplicates are expected from this join.)

Any ideas/suggestions on how to solve this?

Thanks,
NV

Posted: Tue Apr 09, 2013 8:20 am
by BI-RMA
When you run the same job with the same input data, using hash partitioning and sorting on identical columns of both input links to a join, you can't get different results in subsequent runs.

So if you did not change the job's design, it is very likely that your input data changed between the first and a later run of the job.
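
To illustrate the point, here is a minimal Python sketch of a hash-partitioned, sorted left outer join. It is not DataStage code: the function names, the sample rows, and the use of modulo as a stand-in for the hash function are all assumptions made for illustration. It only shows that, for fixed input data, the output row count does not depend on the order in which rows arrive.

```python
import random
from collections import defaultdict

def hash_partition(rows, key, n_parts):
    """Route each row to a partition by its integer key (modulo stands in for the hash)."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[row[key] % n_parts].append(row)
    return parts

def left_outer_join(left, right, key):
    """Sort both inputs on the key, then emit one output row per matching pair."""
    right_by_key = defaultdict(list)
    for r in sorted(right, key=lambda r: r[key]):
        right_by_key[r[key]].append(r)
    out = []
    for l in sorted(left, key=lambda l: l[key]):
        matches = right_by_key.get(l[key])
        if matches:
            out.extend((l["id"], l[key], m["val"]) for m in matches)
        else:
            out.append((l["id"], l[key], None))  # unmatched left row is kept
    return out

def partitioned_join(left, right, key, n_parts):
    """Join partition i of the left input with partition i of the right input."""
    out = []
    left_parts = hash_partition(left, key, n_parts)
    right_parts = hash_partition(right, key, n_parts)
    for lp, rp in zip(left_parts, right_parts):
        out.extend(left_outer_join(lp, rp, key))
    return out

# Hypothetical sample data with duplicate keys on both links.
left = [{"id": 1, "k": 10}, {"id": 2, "k": 10}, {"id": 3, "k": 20}, {"id": 4, "k": 30}]
right = [{"k": 10, "val": "a"}, {"k": 10, "val": "b"}, {"k": 20, "val": "c"}]

# Same data, different arrival order on each simulated run.
runs = []
for seed in range(5):
    rng = random.Random(seed)
    l, r = left[:], right[:]
    rng.shuffle(l)
    rng.shuffle(r)
    runs.append(len(partitioned_join(l, r, "k", 4)))

print(runs)  # [6, 6, 6, 6, 6]
```

Because rows with the same key always land in the same partition on both links, every matching pair is found exactly once, whatever the arrival order. A changing row count therefore points to changing input rather than to the join itself.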

Posted: Tue Apr 09, 2013 8:25 am
by nvalia
The input data has definitely not changed, as I am the one controlling it. I can say this with certainty.

I also know I should not get different results, but I am, so I am checking whether there is anything else I can do in the design/flow.

Posted: Tue Apr 09, 2013 9:00 am
by prasannakumarkk
Did you see any warnings in the Director? Did you clear the propagate partition in previous stages?

Posted: Tue Apr 09, 2013 10:31 am
by nvalia
BI-RMA, you are correct.

The source data was varying because, for testing, a top n clause was used to restrict the data.

So DataStage does give predictable results when partitioned and sorted correctly, as expected.
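
For anyone hitting the same symptom, the effect can be sketched in Python. This assumes the top n clause had no explicit ordering, which the thread does not state. The key values, the choice of which link was restricted, and the helper names are also hypothetical. The sketch shows how taking the first n rows of an unordered source can change the number of rows a left outer join produces.

```python
def top_n(rows, n, order_key=None):
    """Return the first n rows; sort first only when order_key is given."""
    if order_key is not None:
        rows = sorted(rows, key=order_key)
    return rows[:n]

def join_row_count(left_keys, right_keys):
    """Row count of a left outer join on the key:
    one row per match, or one row when a left key has no match."""
    return sum(max(1, right_keys.count(k)) for k in left_keys)

left = [1, 2, 3]                    # hypothetical left-link keys
source = [1, 1, 1, 2, 2, 3, 4, 5]   # hypothetical right-link keys before restriction

# Two possible arrival orders of the same source rows.
arrivals = [source, list(reversed(source))]

unordered = [join_row_count(left, top_n(a, 4)) for a in arrivals]
ordered = [join_row_count(left, top_n(a, 4, order_key=lambda k: k)) for a in arrivals]

print(unordered)  # [5, 3]
print(ordered)    # [5, 5]
```

Under these assumptions, adding an explicit ordering to the restricted test query (or materialising the test subset once) keeps the input, and so the join output, stable between runs.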