Performance tuning in sort stage.

santhob
Participant
Posts: 4
Joined: Fri Jan 25, 2008 6:01 am
Location: chennai

Hi,
I am working with DataStage 7.5. One of our jobs processes more than 400 million records: it reads data from a sequential file and a UDB stage and feeds it to a UDB stage. We use Copy, Filter and Sort stages in between for business validation. I need to reduce the run time of the job, which usually takes more than 5 to 6 hours.

The Sort stage might be taking a significant share of that time, so I am looking for something to change there. The Sort stage sorts the data in ascending order on a key and removes duplicates. Is it possible to reduce the time it takes?

The Sort stage has one relevant option on its Properties tab:
Restrict Memory Usage, currently set to the default.
Will increasing Restrict Memory Usage improve the performance of the job?

Please advise how to tune the performance of this job.
veera24
Premium Member
Posts: 150
Joined: Thu Feb 07, 2008 9:37 pm
Location: NewYork

Hi,
You can try the following command in the Transformer stage's properties.

sort -t"~" -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 -k6,6 -k7,7 FILE1 > FILE2

Here,
~ ----> the delimiter (change it to match your own delimiter)
-k1,1 -k2,2 etc. ----> the key columns you want to sort on
FILE1: input file name
FILE2: output file name
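
As a rough illustration of how the delimiter and key options behave, here is a small sketch using the operating system's sort command outside DataStage. The file names sample_in.txt and sample_out.txt and the three sample records are made up for the example, and the -u option (keep one record per distinct key) is an addition that is not part of the command above.

```shell
# Hypothetical sample data: three "~"-delimited records; two share the same key (b~2).
printf '%s\n' 'b~2~x' 'a~1~y' 'b~2~z' > sample_in.txt

# Sort on fields 1 and 2 only, using "~" as the field delimiter.
# -u keeps a single record for each distinct key, so one of the b~2 records is dropped.
# LC_ALL=C forces plain byte-order comparison so the result does not depend on the locale.
LC_ALL=C sort -t'~' -k1,1 -k2,2 -u sample_in.txt > sample_out.txt

# Show the sorted, de-duplicated result.
cat sample_out.txt
```

Because the keys stop at field 2, the third field plays no part in the comparison; that is why the two b~2 records count as duplicates even though their last fields differ.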

But I'm not sure whether it will work in the parallel edition, because I have only used this command in Server edition. If it works in parallel too, kindly let me know.

Thanks,
Veera
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

"might be"? Have you measured anything?

Increasing the amount of memory allocated to the Sort stage may help, provided that you have spare memory capacity.