Different results on multiple nodes


krishna81
Premium Member
Posts: 78
Joined: Tue May 16, 2006 8:01 am
Location: USA


Post by krishna81 »

When I run a job on a single node, the data and the number of records I get are correct. When I run the same job on 4 (or multiple) nodes, it behaves strangely: the number of records I get is different (more than the number of records from the source).
The job has a Join stage, and a Transformer filters the data with a constraint on a column from the right link of the join.

Source(CFF) -> join -> Xfrm

Both input links to the Join stage are hash partitioned and sorted on keys.
Could anyone suggest why this behaves differently on multiple nodes?
Datastage User
BI-RMA
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Re: Different results on multiple nodes

Post by BI-RMA »

krishna81 wrote:Both the input links to join stage are hash partitioned and are sorted on keys.
That is: the key columns of the Join stage, not the key columns of the input tables? Do partitioning and sorting use the same keys for all inputs?

It definitely sounds like a partitioning problem. Did you try setting the partitioning to Auto and leaving the sorting to DataStage? Your scenario does not really look like one where the extra effort of setting hashing and sorting manually is needed. Mind you: DataStage will not override incorrect settings if you set them manually.
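
To illustrate why the choice of partitioning key matters, here is a rough Python sketch. It is not DataStage code: the table names, columns, and the modulo "hash" are all made up for the example, and it uses a plain inner join. It only shows the general effect that a join run per partition matches the single-node result when both inputs are partitioned on the join key, and diverges when one input is partitioned on a different key. It does not try to reproduce the exact symptom from the original post.

```python
def hash_partition(rows, key, nodes):
    """Distribute rows over `nodes` partitions by row[key].

    A plain modulo stands in for a real hash function (assumption for
    illustration; keys here are small integers).
    """
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[row[key] % nodes].append(row)
    return parts


def local_inner_join(left, right, join_key):
    """Inner join of two row lists that live on the same node."""
    index = {}
    for r in right:
        index.setdefault(r[join_key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[join_key], [])]


def parallel_join(left, right, join_key, left_part_key, right_part_key, nodes):
    """Partition both inputs, join each pair of partitions, collect results."""
    left_parts = hash_partition(left, left_part_key, nodes)
    right_parts = hash_partition(right, right_part_key, nodes)
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(local_inner_join(lp, rp, join_key))
    return out


# Hypothetical sample data: orders join customers on cust_id.
orders = [
    {"order_id": 1, "cust_id": 10},
    {"order_id": 2, "cust_id": 11},
    {"order_id": 3, "cust_id": 10},
    {"order_id": 4, "cust_id": 12},
    {"order_id": 5, "cust_id": 11},
    {"order_id": 6, "cust_id": 13},
]
customers = [
    {"cust_id": 10, "region": "N"},
    {"cust_id": 11, "region": "S"},
    {"cust_id": 12, "region": "E"},
    {"cust_id": 13, "region": "W"},
]

# One node: every row is in the same partition, so all matches are found.
single = parallel_join(orders, customers, "cust_id", "cust_id", "cust_id", 1)

# Four nodes, both inputs partitioned on the join key: same result.
good = parallel_join(orders, customers, "cust_id", "cust_id", "cust_id", 4)

# Four nodes, orders partitioned on its own table key instead of the
# join key: matching rows land on different nodes and the count changes.
bad = parallel_join(orders, customers, "cust_id", "order_id", "cust_id", 4)

print(len(single), len(good), len(bad))  # 6 6 1
```

In the last call only order 4 happens to land on the same node as its customer, so five of the six matches are lost. That is the kind of difference between single-node and multi-node runs to look for when the partitioning keys and the join keys are not the same.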
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon