Query about Remove duplicates, join stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

Hi

Is it mandatory that the Remove Duplicates stage be given sorted data? What if the data is not explicitly sorted and DataStage is forced not to insert any sorts? Would that cause any data issues?

The same question goes for the Join stage and the Change Data Capture stage.

Consider that we are hashing the data on the keys.

Please advise.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

This really is a case where a 1-minute job (2 x row generator, 1 join, 1 peek) will answer your question for you. If you disable sort generation and feed the stages unsorted data, they will fail.
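
To illustrate why hashing on the keys alone is not enough, here is a minimal Python sketch. It is not DataStage code. It assumes a remove-duplicates step that only compares each row with the one before it, and all function names are hypothetical. Hash partitioning puts equal keys in the same partition, but it does not make them adjacent. In this toy model the failure shows up as duplicates that survive; how the real stages react (wrong output, warnings, or an abort) is best confirmed with the test job described above.

```python
from itertools import groupby


def hash_partition(rows, key, partitions):
    """Route each row to a partition by its key; arrival order is preserved."""
    buckets = [[] for _ in range(partitions)]
    for row in rows:
        buckets[hash(key(row)) % partitions].append(row)
    return buckets


def remove_adjacent_duplicates(rows, key):
    """Keep the first row of every run of consecutive rows with the same key."""
    return [next(group) for _, group in groupby(rows, key=key)]


def dedup(rows, key, partitions, sort_first):
    """Hash-partition, optionally sort each partition, then dedup per partition."""
    output = []
    for part in hash_partition(rows, key, partitions):
        if sort_first:
            part = sorted(part, key=key)  # stable sort on the key
        output.extend(remove_adjacent_duplicates(part, key))
    return output


def first_field(row):
    """Key extractor: the first column is the key (an assumption of this sketch)."""
    return row[0]


# Illustrative data only: key 1 appears twice, separated by key 3.
rows = [(1, "a"), (3, "b"), (1, "c"), (2, "d"), (2, "e")]

unsorted_keys = sorted(first_field(r) for r in dedup(rows, first_field, 2, sort_first=False))
sorted_keys = sorted(first_field(r) for r in dedup(rows, first_field, 2, sort_first=True))

print("hash only:      ", unsorted_keys)
print("hash plus sort: ", sorted_keys)
```

With hashing only, both rows for key 1 land in the same partition but are separated by the row for key 3, so the adjacent-row comparison never sees them together. Sorting each partition on the key first brings them next to each other.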