Job design for the section of interest:
               Remove Duplicate --> Aggregator
               Remove Duplicate --> Aggregator
               Remove Duplicate --> Aggregator
               Remove Duplicate --> Aggregator
Transformer                                       Join Stage --> DataSet
               Remove Duplicate --> Aggregator
               Remove Duplicate --> Aggregator
               Remove Duplicate --> Aggregator
               Remove Duplicate --> Aggregator
The input link of the Transformer was originally partitioned and sorted as follows:
ProductID      Partition, Sort
OrderDt        Partition, Sort
CompanyName    Sort
Person1Name    Sort
Person2Name    Sort
OrderID        Sort
Each output link of the Transformer has a different constraint that determines which records pass through it.
All Remove Duplicate stages removed duplicates on the six fields above.
All Aggregator stages grouped on the following fields and produced a record count:
ProductID
OrderDt
CompanyName
Person1Name
Person2Name
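To make the intended logic of one Remove Duplicate --> Aggregator branch concrete, here is a minimal Python sketch of what I expect it to do: drop duplicates on the six fields, then count records per group of the five fields. This is not DataStage code. The function names and the sample rows are made up for illustration only.

```python
from collections import Counter

KEYS6 = ("ProductID", "OrderDt", "CompanyName",
         "Person1Name", "Person2Name", "OrderID")
KEYS5 = KEYS6[:5]  # grouping fields: everything except OrderID


def remove_duplicates(rows, keys=KEYS6):
    """Keep the first row seen for each distinct combination of `keys`."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[f] for f in keys)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def count_by_group(rows, keys=KEYS5):
    """Count rows per distinct combination of `keys`."""
    return Counter(tuple(row[f] for f in keys) for row in rows)


def make_row(*values):
    """Build a row dict from values given in KEYS6 order."""
    return dict(zip(KEYS6, values))


# Made-up sample data, not taken from the real job.
rows = [
    make_row(1, "2010-01-01", "Acme", "Ann", "Bob", 100),
    make_row(1, "2010-01-01", "Acme", "Ann", "Bob", 100),   # exact duplicate
    make_row(1, "2010-01-01", "Acme", "Ann", "Bob", 101),   # same group, new order
    make_row(1, "2010-01-01", "Globex", "Ann", "Bob", 102), # different group
]

counts = count_by_group(remove_duplicates(rows))
for group, n in counts.items():
    print(group, n)
```

In a parallel job this result is only correct if all rows of one group are processed in the same partition, which is why the partitioning on the Transformer input link matters.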
Then I changed the input link of the Transformer to the following:
ProductID      Partition, Sort
OrderDt        Partition, Sort
CompanyName    Partition, Sort
Person1Name    Partition, Sort
Person2Name    Partition, Sort
OrderID        Sort
I do not understand why the job did not work when partitioning on 2 fields but did work when partitioning on 5 fields. To me it should have worked in either case.
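The reasoning behind "it should have worked in either case" can be sketched as follows: with hash partitioning, rows that agree on all five grouping fields also agree on ProductID and OrderDt, so they should land in the same partition whether the hash uses 2 fields or 5. The Python below is only a simulation of generic hash partitioning under that assumption. It does not model DataStage internals, the partition count of 4 is arbitrary, the function names are hypothetical, and the rows are generated test data.

```python
import zlib
from collections import defaultdict
from itertools import product

KEYS6 = ("ProductID", "OrderDt", "CompanyName",
         "Person1Name", "Person2Name", "OrderID")
GROUP_KEYS = KEYS6[:5]     # fields the Aggregator stages group on
PART_KEYS_2 = KEYS6[:2]    # original setting: ProductID, OrderDt
PART_KEYS_5 = KEYS6[:5]    # changed setting: all five grouping fields


def partition_of(row, part_keys, n_parts):
    """Assign a row to a partition by hashing the chosen key fields."""
    key = "|".join(str(row[f]) for f in part_keys)
    return zlib.crc32(key.encode("utf-8")) % n_parts


def split_groups(rows, part_keys, group_keys, n_parts=4):
    """Return the groups whose rows ended up in more than one partition."""
    partitions_seen = defaultdict(set)
    for row in rows:
        group = tuple(row[f] for f in group_keys)
        partitions_seen[group].add(partition_of(row, part_keys, n_parts))
    return [g for g, parts in partitions_seen.items() if len(parts) > 1]


# Generated sample data (assumption, not the real input).
rows = [
    dict(zip(KEYS6, values))
    for values in product(
        [1, 2, 3],
        ["2010-01-01", "2010-01-02"],
        ["Acme", "Globex"],
        ["Ann", "Cy"],
        ["Bob", "Di"],
        [100, 101, 102],
    )
]

print(split_groups(rows, PART_KEYS_2, GROUP_KEYS))  # no group is split
print(split_groups(rows, PART_KEYS_5, GROUP_KEYS))  # no group is split
```

In this simulation neither setting splits a group across partitions, which matches my expectation and means the sketch does not account for the difference I saw in the actual job.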
Also, please let me know if you see any other design or partitioning issues with the above job design.