Sorting and DataSets
Posted: Tue Jul 19, 2005 11:40 am
The documentation suggests that for components with a sort requirement, DS will analyze the incoming flow to determine whether the sort criteria have already been met by a previous (sort) stage. If the criteria have not been met, it will insert a sort internal to the stage.
Questions:
1.) Will DataStage perform this sort analysis across jobs? Suppose I have 2 jobs in a sequence. Job A sorts by key X and writes to a sequential file, and Job B reads the sequential file written by A and aggregates grouped by key X. Will DS be smart enough to recognize, while executing this sequence, that a sort in the second job is not necessary?
2.) Same scenario, but using DataSets rather than sequential files. Is there any information in the dataset format that DS can use to know that the data is already sorted?