When would we implement a sort routine prior to the Aggregator? I know that if the partitioning on the Aggregator's input tab is set to Auto, DataStage takes care of the sorting and grouping. What concerns me is the documentation, which states that we should sort and repartition the data prior to the Aggregator. Is that really required, given that the Director log shows a sort and grouping being done on the selected keys? I have tested both mechanisms and they return exactly the same data.
Is repartitioning only used with the Remove Duplicates and Join stages?
Could someone please elaborate?
Thanks
Question regarding Sort and Partitioning
Jim Stewart
Using an explicit Sort stage lets YOU control the resources allocated to sorting and, perhaps, assert that some or all of the key columns are already sorted or grouped. Relying on DataStage to insert tsort operators means that you get the default allocations, which may not be optimal.
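To illustrate why sorting and partitioning on the grouping keys matter before an aggregation, here is a minimal conceptual sketch in Python. It is not DataStage code and does not reflect how the engine is implemented; the function names (`hash_partition`, `aggregate_sorted`, `aggregate_hashed`) and the sample columns `cust` and `amt` are made up for the example. It shows that a sort-based aggregation over key-partitioned data returns the same totals as a hash-based aggregation over all rows, which is consistent with both approaches sending back identical data; the difference lies in resource use, not in the result.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def hash_partition(rows, key, n_partitions):
    """Route every row with the same key value to the same partition."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[hash(row[key]) % n_partitions].append(row)
    return partitions


def aggregate_sorted(rows, key, value):
    """Sum `value` per `key`, assuming `rows` is already sorted on `key`.

    Only the current group is held at a time, so memory use does not
    grow with the number of distinct keys.
    """
    for k, group in groupby(rows, key=itemgetter(key)):
        yield k, sum(r[value] for r in group)


def aggregate_hashed(rows, key, value):
    """Sum `value` per `key` without sorting; keeps every group in memory."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[key]] += r[value]
    return dict(totals)


# Hypothetical sample data.
rows = [
    {"cust": "a", "amt": 1},
    {"cust": "b", "amt": 2},
    {"cust": "a", "amt": 3},
    {"cust": "c", "amt": 5},
]

# Partition on the grouping key, sort each partition, then aggregate.
merged = {}
for part in hash_partition(rows, "cust", 3):
    ordered = sorted(part, key=itemgetter("cust"))
    merged.update(aggregate_sorted(ordered, "cust", "amt"))
```

Because the rows are partitioned on the grouping key, no key is split across partitions, so the per-partition results can simply be combined. An explicit sort step is where you would tune how that sorting is done, rather than accepting whatever default is inserted for you.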
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.