I have to replicate the following scenario in DataStage.
There is a sequential file that has been sorted on five keys in descending order by a SyncSort program. The sorted file is then passed to a second SyncSort program, which uses three keys (a subset of the five used earlier), removes duplicates, and writes the result to another sequential file.
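To make the target behaviour concrete, here is a minimal Python sketch of the two-step logic described above. It is only an illustration, not the SyncSort control cards: the column names `k1` to `k5` are hypothetical, and it assumes the duplicate removal keeps the first record of each three-key group, which is not stated above and would need to be confirmed against the actual SyncSort step.

```python
def sort_then_dedupe(rows, sort_keys, dedupe_keys):
    """Sort rows descending on sort_keys, then keep one row per dedupe_keys group."""
    # Step 1: full sort on all five keys, descending.
    ordered = sorted(
        rows,
        key=lambda r: tuple(r[k] for k in sort_keys),
        reverse=True,
    )
    # Step 2: walk the sorted rows and keep the first row seen for each
    # combination of the dedupe keys (ASSUMPTION: "first" is retained).
    seen = set()
    kept = []
    for row in ordered:
        group = tuple(row[k] for k in dedupe_keys)
        if group not in seen:
            seen.add(group)
            kept.append(row)
    return kept


# Hypothetical sample data; k1..k5 stand in for the real key columns.
rows = [
    {"k1": 1, "k2": 1, "k3": 1, "k4": 1, "k5": 1},
    {"k1": 2, "k2": 1, "k3": 1, "k4": 5, "k5": 0},
    {"k1": 1, "k2": 1, "k3": 1, "k4": 9, "k5": 3},
    {"k1": 2, "k2": 1, "k3": 1, "k4": 5, "k5": 7},
]
result = sort_then_dedupe(rows, ["k1", "k2", "k3", "k4", "k5"], ["k1", "k2", "k3"])
for row in result:
    print(row)
```

Under that assumption, which record survives in each group depends on the two extra sort keys, so the five-key order has to be intact when the duplicates are removed.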
I have tried to replicate this with a Sort stage followed by a Remove Duplicates stage. As is well known, a warning is raised when the keys used in a Sort stage differ from those used in the following stage. Here, though, I have no other option, because I have to replicate the existing scenario. I have used hash partitioning for the Sort stage and 'Same' partitioning for the Remove Duplicates stage.
The number of output records from the PX job is the same as from the SyncSort utility, but the order is jumbled. Also, is there any way to remove the warning in this particular scenario? The warning is: When checking operator: User inserted sort "Sort_stage" does not fulfill the sort requirements of the downstream operator "Remove_Dups_Stage".
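One possible explanation for the jumbled order, offered as an assumption rather than a diagnosis of this job: if the job runs on more than one partition, each partition is sorted on its own, and the rows are then collected without a sort-aware merge, the combined output is not globally ordered even though the record count is right. The Python sketch below simulates that effect. The modulo partitioner is a stand-in and is not DataStage's hash function, and the function names are hypothetical.

```python
import heapq


def partition_and_sort(rows, n_partitions):
    """Split rows into partitions by key, then sort each partition descending."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        # Stand-in for hash partitioning on the first key column.
        parts[row[0] % n_partitions].append(row)
    return [sorted(p, reverse=True) for p in parts]


def collect_in_partition_order(parts):
    """Concatenate partitions one after another (no merge on the sort keys)."""
    return [row for p in parts for row in p]


def collect_with_sort_merge(parts):
    """Merge the already-sorted partitions so the global descending order holds."""
    return list(heapq.merge(*parts, reverse=True))


rows = [(5,), (4,), (3,), (2,), (1,)]
parts = partition_and_sort(rows, 2)
print(collect_in_partition_order(parts))  # same rows, not globally sorted
print(collect_with_sort_merge(parts))     # same rows, globally sorted
```

If this is what is happening, the same records appear in both results and only the way the partitions are recombined differs, which would match the symptom of a matching count with a different order.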
Case Study with SyncSort