Sorting Stage Tuning

XRAY
Participant
Posts: 33
Joined: Mon Apr 03, 2006 12:09 am

Sorting Stage Tuning

Post by XRAY »

Hi all

I have a job that takes 0.3 billion records and does a lookup (with hash partitioning), a sort (with hash partitioning) and an aggregation. The job uses a stable sort and takes 154 mins to finish.

For better performance, I made the following changes to the job:

Test 1) Changed to a non-stable sort; the job finished in 172 mins.

Test 2) Kept the stable sort and set Restrict Memory Usage = 60 MB; the job finished in 146 mins.

Test 3) Kept the stable sort, removed the unnecessary hash partitioning before the lookup, and set Restrict Memory Usage to

a) 40 MB: the job finished in 172 mins.

b) 60 MB: the job aborted due to a full scratch disk.


I would like to ask:

i) Does a non-stable sort do nothing good here, but actually hurt performance?

ii) How should the value of "Restrict Memory Usage" be decided?

iii) Does a higher "Restrict Memory Usage" need more scratch disk? Shouldn't it only mean allocating more memory to the Sort stage?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Stable sort does require rather more memory. Do you really need a stable sort? That is, is it really part of your requirement that the original order of records be preserved for each value of the sort key? In my experience it rarely is, so I'd first look at disabling stable sort.
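To see what stability actually promises, here is a small plain-Python illustration (not DataStage code; the values are made up). A stable sort keeps the arrival order of records that share a key; a non-stable sort is free to reorder them:

# Python's sorted() happens to be stable, so it can demonstrate the guarantee.
rows = [
    ("A", 1),  # (sort_key, arrival_order)
    ("B", 2),
    ("A", 3),
    ("B", 4),
]

print(sorted(rows, key=lambda r: r[0]))
# -> [('A', 1), ('A', 3), ('B', 2), ('B', 4)]  (equal keys keep arrival order)

# A non-stable sort could legitimately emit ('A', 3) before ('A', 1).
# If nothing downstream depends on that order, stability is wasted work.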

I'd also look at cleaning up your scratch disk and, perhaps, even increasing the Restrict Memory Usage by the same amount again - that is, to 100MB per node - if you have sufficient free memory.
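To make the memory/scratch trade-off concrete: a sort that cannot hold everything in memory writes sorted runs to scratch and merges them afterwards. Here is a rough Python sketch of that mechanism (illustrative only, not what the underlying tsort operator literally does):

import heapq
import tempfile

def external_sort(rows, max_rows_in_memory):
    # Spill sorted runs to temporary "scratch" files once the
    # in-memory buffer hits the cap, then merge all the runs.
    runs, buf = [], []
    for row in rows:
        buf.append(row)
        if len(buf) >= max_rows_in_memory:
            f = tempfile.TemporaryFile(mode="w+")  # lives on scratch disk
            f.writelines(r + "\n" for r in sorted(buf))
            f.seek(0)
            runs.append(f)
            buf = []
    # Merge the in-memory remainder with every spilled run.
    streams = [sorted(buf)] + [(line.rstrip("\n") for line in f) for f in runs]
    return heapq.merge(*streams)

print(list(external_sort(["d", "b", "a", "c", "e"], max_rows_in_memory=2)))
# -> ['a', 'b', 'c', 'd', 'e']

Note that the total volume spilled is roughly the whole input regardless of the cap; a bigger cap mainly buys fewer, larger runs and a cheaper merge phase, which is why raising it does not by itself shrink the scratch footprint.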

If all that fails, allocate more scratch disk.
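For reference, scratch space comes from the scratchdisk resources in the parallel configuration (APT) file, and you can list more than one per node so the sort has several places to spill. A minimal single-node example (host name and paths made up):

{
  node "node1"
  {
    fastname "etlserver"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/scratch1" {pools ""}
    resource scratchdisk "/scratch2" {pools ""}
  }
}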
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.