
Different results on multiple nodes

Posted: Thu Oct 27, 2011 7:06 am
by krishna81
When I run a job on a single node, the data and the number of records I get are correct. When I run the same job on 4 (or multiple) nodes, it behaves strangely: the number of records I get is different (more than the number of records in the source).
The job has a Join stage, and the data is filtered in a Transformer with a constraint on a column from the right link of the join.

Source(CFF) -> join -> Xfrm

Both the input links to join stage are hash partitioned and are sorted on keys.
Could anyone suggest why this behaves strangely on multiple nodes?

Re: Different results on multiple nodes

Posted: Thu Oct 27, 2011 8:33 am
by BI-RMA
krishna81 wrote: Both the input links to join stage are hash partitioned and are sorted on keys.
That is, on the key columns of the Join stage, not the key columns of the input tables? And do partitioning and sorting use the same keys on all inputs?

It definitely sounds like a partitioning problem. Did you try setting the partitioning to Auto and leaving the sorting to DataStage? Your scenario does not look like one where the extra effort of setting hashing and sorting manually is needed. Mind you, DataStage will not override incorrect settings if you set them manually.
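
To illustrate why the partitioning keys matter, here is a minimal Python sketch. It is not DataStage code, only a simulation of the idea. It assumes a full outer join, which the thread does not state, plus made-up data and hypothetical helper names (`partition`, `full_outer_join`, `partitioned_join`). Each node joins only the rows it holds, so if the two inputs are hashed on different columns, rows with the same join key can land on different nodes and come out unmatched.

```python
from collections import defaultdict

def partition(rows, key_fn, nodes):
    """Assign each row to a node by hashing key_fn(row)."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash(key_fn(row)) % nodes].append(row)
    return parts

def full_outer_join(left, right):
    """Join (key, value) rows on key, keeping unmatched rows from both sides."""
    index = defaultdict(list)
    for k, v in right:
        index[k].append(v)
    out, matched = [], set()
    for k, lv in left:
        if k in index:
            matched.add(k)
            out.extend((k, lv, rv) for rv in index[k])
        else:
            out.append((k, lv, None))       # left row with no partner on this node
    for k, rvs in index.items():
        if k not in matched:
            out.extend((k, None, rv) for rv in rvs)  # right row with no partner
    return out

def partitioned_join(left, right, left_key, right_key, nodes):
    """Partition both inputs, then join node by node, as a parallel job would."""
    lp = partition(left, left_key, nodes)
    rp = partition(right, right_key, nodes)
    out = []
    for n in range(nodes):
        out.extend(full_outer_join(lp[n], rp[n]))
    return out

# Illustrative data: (join key, some other column). Integers hash deterministically.
left = [(k, k * 10) for k in range(8)]
right = [(k, k + 1) for k in range(8)]

join_key = lambda row: row[0]
other_col = lambda row: row[1]

single = partitioned_join(left, right, join_key, join_key, nodes=1)
good = partitioned_join(left, right, join_key, join_key, nodes=4)
bad = partitioned_join(left, right, join_key, other_col, nodes=4)

print(len(single), len(good), len(bad))
```

In this sketch the 4-node run that hashes both inputs on the join key gives the same rows as the single-node run, while the run that hashes the right input on a different column returns more rows than either input holds, because every row surfaces as unmatched. With an inner join the same mistake would drop rows instead.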