Join Stage Varying Results

nvalia · Post by **nvalia** » Tue Apr 09, 2013 8:03 am

Hi,

DS 8.7 on Windows

I am doing a left outer join (Join stage) on an integer key field. Both inputs are hash partitioned and sorted on the same key (using the Join stage's built-in sort), and there are no nulls on either link.

But when the same job runs multiple times, I get a varying number of duplicate records from this join. (Duplicates are expected from this join.)

Any ideas/suggestions on how to solve this?

Thanks,
NV

BI-RMA · Post by **BI-RMA** » Tue Apr 09, 2013 8:20 am

When you run the same job with the same input data, using hash partitioning and sorting on identical columns of both input links to a join, you can't get different results in subsequent runs.

So if you did not change the job's design, it is very likely that your input data changed between the first run and a later run of the job.
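
As a minimal sketch of why the row count depends only on the input data: in a left outer join, each left row produces one output row per matching right row, or a single row if nothing matches. The Python below is an illustration of that arithmetic, not DataStage code; the function name and the sample keys are made up for the example.

```python
from collections import Counter

def left_outer_join_row_count(left_keys, right_keys):
    """Number of rows a left outer join emits, given only the key columns."""
    right_counts = Counter(right_keys)
    # Each left row pairs with every right row that has the same key;
    # a left row with no match still yields exactly one output row.
    return sum(max(right_counts[k], 1) for k in left_keys)

# Hypothetical sample keys, chosen only to show duplicates on both links.
left = [1, 2, 2, 3]
right = [2, 2, 3, 3, 3]

count_in_order = left_outer_join_row_count(left, right)
count_reversed = left_outer_join_row_count(left[::-1], right[::-1])
print(count_in_order, count_reversed)
```

The count is the same whatever order the rows arrive in, so a different number of duplicates between runs points to different key values on one of the links.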

nvalia · Post by **nvalia** » Tue Apr 09, 2013 8:25 am

The input data has definitely not changed, as I am the one controlling it. I can say this with certainty.

I also know I should not get different results, but I am, hence I am checking whether there is anything else I can do in the design/flow.

prasannakumarkk · Post by **prasannakumarkk** » Tue Apr 09, 2013 9:00 am

Did you see any warnings in the Director? Did you clear the propagate partitioning setting in the previous stages?

nvalia · Post by **nvalia** » Tue Apr 09, 2013 10:31 am

BI-RMA, you are correct.

The source data was varying because, for testing, a top n clause was used to restrict the data.
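
A rough sketch of how a top n restriction can change the join output, assuming the clause had no deterministic ordering (the thread does not say whether it did). The Python below only simulates the effect; `top_n`, the arrival orders, and the sample keys are all hypothetical.

```python
from collections import Counter

def left_outer_join_row_count(left_keys, right_keys):
    """Number of rows a left outer join emits, given only the key columns."""
    right_counts = Counter(right_keys)
    return sum(max(right_counts[k], 1) for k in left_keys)

def top_n(rows, n):
    """Stand-in for a 'top n' restriction: keep the first n rows as they arrive."""
    return rows[:n]

right = [2, 2, 3, 3, 3]

# Same source rows, returned in two different (hypothetical) arrival orders.
left_run_1 = top_n([1, 2, 2, 3], 2)  # keeps keys 1 and 2
left_run_2 = top_n([3, 2, 2, 1], 2)  # keeps keys 3 and 2

rows_run_1 = left_outer_join_row_count(left_run_1, right)
rows_run_2 = left_outer_join_row_count(left_run_2, right)
print(rows_run_1, rows_run_2)
```

The two runs feed different keys into the join, so the number of duplicate rows differs even though the underlying table and the job design are unchanged.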

So DataStage does give predictable results when the data is partitioned and sorted correctly, as expected.