To speed up the aggregator

ICE
Participant
Posts: 249
Joined: Tue Oct 25, 2005 12:15 am

Post by ICE »

Dear All,

Is there any way to speed up the Aggregator stage?
Any suggestions, please.


Thanks,
ICE
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

pre-sort the incoming data and tell the aggravator about it.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You might also investigate option #6 on the DS.TOOLS menu, where you can modify the reporting intervals and memory allocation model.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Wow, version 4. You could use a Universe stage, write all your data into it and pull it out with a group by SQL command. You could upgrade a couple versions and try multiple instance jobs. Upgrade to DataStage EE parallel jobs with much faster sort and aggregation functions.
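
A rough sketch of the "write the data to a table and pull it out with GROUP BY" idea from the post above. Python's built-in sqlite3 module is used purely as a stand-in for a UniVerse table; the table name `staging`, its columns, and the sample rows are invented for illustration and are not from the original job.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the staging table;
# the table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (cust_id TEXT, amount REAL)")

rows = [("C2", 10.0), ("C1", 5.0), ("C2", 2.5), ("C1", 1.0), ("C3", 7.0)]
conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)

# Let the database do the grouping instead of a separate aggregation step.
totals = conn.execute(
    "SELECT cust_id, SUM(amount), COUNT(*) "
    "FROM staging GROUP BY cust_id ORDER BY cust_id"
).fetchall()
conn.close()

for cust_id, total, row_count in totals:
    print(cust_id, total, row_count)
```

Whether this beats the Aggregator stage in practice depends on how fast the table can be loaded and queried, which this sketch does not measure.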
ICE
Participant
Posts: 249
Joined: Tue Oct 25, 2005 12:15 am

Post by ICE »

Dear Ray,

I am now looking at option #6 on the DS.TOOLS menu.
I see there are several settings to choose from, such as ReportAfterRows, ReportAfterTime, TableSize, and so on.

May I know which setting would maximize the performance of the aggregator?
Is it the TableSize option?

Thanks in advance,
ICE

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

At the menu prompt enter each number followed by a question mark; for example

2?
- this will give you more information. Which ones to change depends on where your performance problem is. Increasing the reporting interval (so that the stage updates its status less frequently) will always help.

Sorted input (sorted by grouping keys) will give the best gains.
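
A toy illustration of why a longer reporting interval reduces overhead, as suggested in the post above. This is not how the stage is implemented. The parameter name `report_after_rows` is borrowed from the ReportAfterRows setting mentioned earlier in the thread, and its meaning here (one status update every N rows) is an assumption.

```python
def process(rows, report_after_rows):
    """Sum the values, counting one status update every report_after_rows rows."""
    total = 0
    status_updates = 0
    for row_number, value in enumerate(rows, start=1):
        total += value
        if row_number % report_after_rows == 0:
            # Stand-in for the cost of refreshing the stage's status.
            status_updates += 1
    return total, status_updates


rows = range(1, 100001)
total_small, updates_small = process(rows, 100)
total_large, updates_large = process(rows, 10000)

print(updates_small, updates_large)
```

The result is the same in both runs; only the number of status updates differs, so any per-update cost is paid far less often with the larger interval.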
ICE
Participant
Posts: 249
Joined: Tue Oct 25, 2005 12:15 am

Post by ICE »

Thank you, Ray.

Thank you all for your advice.
:)
karrisuresh
Participant
Posts: 57
Joined: Sat Jun 09, 2007 1:14 am
Location: chicago

Post by karrisuresh »

Hi,

Always sort the data before it is sent to the aggregator, so that all the data belonging to one particular group arrives together in one chunk. Grouping at the Aggregator stage then becomes faster.

Thanks,
suresh

Hi I have experience in parallel extender datastage I am ready to give/take help from other
hope we all help each other hand in hand
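
A minimal sketch of the idea in the post above: once rows are sorted on the grouping key, all rows for a group arrive together, so only one running total has to be held at a time. It uses Python's `itertools.groupby` to show the general technique and makes no claim about how the Aggregator stage works internally; the sample rows are invented.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows):
    """Sum amounts per key, assuming rows arrive sorted by key.

    A group is emitted as soon as the key changes, so no table of all
    groups is kept in memory.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(amount for _, amount in group)


rows = [("B", 2), ("A", 1), ("B", 3), ("A", 4)]

# Sorted on the grouping key: one output row per group.
sorted_result = list(aggregate_sorted(sorted(rows, key=itemgetter(0))))

# Not sorted on the grouping key: groups are split across several outputs.
unsorted_result = list(aggregate_sorted(rows))

print(sorted_result)
print(unsorted_result)
```

The second call shows why the sort order has to match the grouping key: a streaming aggregator that trusts the order emits fragmented groups when that trust is misplaced.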
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

karrisuresh wrote:always sort the data before it is sent to aggregator
As noted, there is more to it than that. You also have to assert the sorted order in the stage so it knows you've done this. And hopefully you've sorted in such a manner that supports the grouping being done, otherwise it's all for naught.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ICE
Participant
Posts: 249
Joined: Tue Oct 25, 2005 12:15 am

Post by ICE »

Dear Chulett and karrisuresh,

Could u please let me know whether I need to use a Sort stage before sending the data to the aggregator?
Currently I am using the Sort stage to sort the data before I send it to the aggregator, but I got the error message below.

ImptoTEST.sort14: Unable to retrieve value for property 'SortSpec

I have no experience with the Sort stage.

Thank you all for your advice.
ICE

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The correct spelling of the second person personal pronoun is "you" not "u".
:x

You can use any method you like to sort the data. A Sort stage is one way. If the source is a database table, you could include an ORDER BY clause in the extraction SQL.
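
A small sketch of sorting at the source with an ORDER BY clause, as suggested in the post above. SQLite (via Python's standard sqlite3 module) stands in for whatever database the job reads from; the `sales` table, its columns, and the assumed grouping keys `(region, product)` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", "pen", 3), ("US", "ink", 1), ("EU", "ink", 2),
     ("US", "pen", 5), ("EU", "pen", 4)],
)

# The downstream grouping keys are assumed to be (region, product), so the
# extraction sorts on exactly those columns, in that order.
extracted = conn.execute(
    "SELECT region, product, qty FROM sales ORDER BY region, product"
).fetchall()
conn.close()


def is_contiguous(rows):
    """Return True if every (region, product) key forms one unbroken run."""
    seen = set()
    previous = None
    for region, product, _ in rows:
        key = (region, product)
        if key != previous:
            if key in seen:
                return False
            seen.add(key)
            previous = key
    return True


print(is_contiguous(extracted))
```

The check at the end is only there to make the point visible: the sorted extract delivers each group as one contiguous run, which is what the aggregator needs to be told about.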
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As noted, many sort options exist; the Sort stage is only one of them. If you do use it, check the Property Help for the Sort Specification in the stage, which has examples of the syntax needed. Yes, it's a little confusing if you've never used it before, but check the help and give that a shot.

Just as an FYI, that stage is pretty slow. If you can do this in the source, say an order by in a database, that would generally be more performant. Or perhaps leverage a high-speed sort package or just the plain old 'sort' command from the O/S. All of them typically beat the pants off the Sort stage. :wink:
ICE
Participant
Posts: 249
Joined: Tue Oct 25, 2005 12:15 am

Post by ICE »

Hummmm...........

I use "u" as short for "you" because it is boring to type a lot of words :P
OK. From now on I will try to type the full word. :(
Thank you for your advice :-)

Thank you,
ICE

ICE
Participant
Posts: 249
Joined: Tue Oct 25, 2005 12:15 am

Post by ICE »

Oop!!! Really???
OK. I think I had better use the ORDER BY clause in my SQL. Thank you for your advice.

Thanks,
ICE
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yes, best to do that if you can unless your database sort speed is horrible. Oh, and...

You are welcome. :wink: