Hi,
I am writing data from a sequential file to a dataset.
The volume ranges from 8 million to 20 million records, depending on the file.
I need the data sorted on a single key.
I am not sure which sorting method performs better at this volume.
Please help me understand which is better.
My version is 8.5.
Sort stage or Link Sort
Thanks,
HK
*Go GREEN..Save Earth*
Use an explicit Sort stage. Partition data by the sort key.
The Sort stage allows you to allocate more memory than the default to the sorting operation, which means it takes longer before the sort has to spill to scratchdisk.
You can control the default with the environment variable APT_TSORT_STRESS_BLOCKSIZE, but beware that this is a global change across the scope of the variable (project or job).
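To illustrate why a larger memory allocation delays the spill to scratch disk, here is a minimal Python sketch of a generic external sort. It is not how DataStage implements its sort, and it does not model APT_TSORT_STRESS_BLOCKSIZE directly. The names `external_sort`, `max_rows_in_memory`, and `scratch_dir` are hypothetical stand-ins for the sort operation, its memory limit, and the scratch disk location.

```python
import heapq
import os
import pickle
import tempfile
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # (sort key, payload); hypothetical record layout


def _spill(run: List[Row], scratch_dir: str) -> str:
    """Write one already-sorted run to a scratch file and return its path."""
    fd, path = tempfile.mkstemp(dir=scratch_dir, suffix=".run")
    with os.fdopen(fd, "wb") as fh:
        for row in run:
            pickle.dump(row, fh)
    return path


def _read_run(path: str) -> Iterator[Row]:
    """Stream the rows of a spilled run back from scratch, one at a time."""
    with open(path, "rb") as fh:
        while True:
            try:
                yield pickle.load(fh)
            except EOFError:
                return


def external_sort(
    rows: Iterable[Row], max_rows_in_memory: int, scratch_dir: str
) -> Tuple[List[Row], int]:
    """Sort rows by key; return (sorted rows, number of runs spilled to scratch)."""
    run_paths: List[str] = []
    buffer: List[Row] = []
    for row in rows:
        if len(buffer) == max_rows_in_memory:
            # Memory budget exhausted: sort what we hold and spill it.
            buffer.sort(key=lambda r: r[0])
            run_paths.append(_spill(buffer, scratch_dir))
            buffer = []
        buffer.append(row)
    buffer.sort(key=lambda r: r[0])  # the final buffer never leaves memory
    runs = [_read_run(p) for p in run_paths] + [iter(buffer)]
    merged = list(heapq.merge(*runs, key=lambda r: r[0]))
    for path in run_paths:
        os.remove(path)
    return merged, len(run_paths)


if __name__ == "__main__":
    data = [((i * 7919) % 1000, f"row{i}") for i in range(1000)]
    with tempfile.TemporaryDirectory() as scratch:
        for budget in (100, 1000):
            _, spills = external_sort(data, budget, scratch)
            print(f"budget={budget} rows, spilled runs={spills}")
```

With the smaller budget the sort has to write several runs to scratch and merge them back; with a budget that holds the whole input it never touches disk. That is the trade-off being tuned when you raise the memory limit on the Sort stage.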
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.