
Aggregator - problem with memory

Posted: Thu Jul 01, 2004 7:02 pm
by eoyylo
Hi,
I have an Aggregator stage that must aggregate 20 million records, but the job aborts.
I suspect it is a memory problem: the Aggregator works well up to 6-7 million records, but beyond that it fails.
How can I resolve this problem?
I tried to use the Sort plug-in, but it is very slow. Can the Sort plug-in aggregate the records?

thanks in advance

Mario

Posted: Thu Jul 01, 2004 7:13 pm
by vmcburney
On those older versions of DS Aggregator should be renamed Aggrevator. If your source data is in a table you may get much better performance by doing the aggregation in the source database plug-in. In DataStage the sort stage will sort but not aggregate.

Run your job again and keep an eye on temp file space as the job runs. The aggregator writes a lot of data to temporary files while it aggregates the input data.

Posted: Thu Jul 01, 2004 7:22 pm
by chulett
Actually, it's the Sort stage that uses temp files; the Aggregator works in memory without landing anything to disk. Unless things were different back in 5.x, but I don't believe so. :?

You can substantially reduce the amount of memory (and time) used by the Aggregator by presorting the data and asserting the sort order in the Aggregator stage by marking the appropriate fields. Then again, this advantage may be offset by the amount of time and resources it takes to sort the data in the job. If the Sort stage is too slow, is there any way you can deliver the data to the job already sorted? Perhaps a simple sort at the UNIX level, or some external sorting package you may have access to? Or can the data be created in the order required to support the aggregation?
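As a rough illustration of the "simple sort at the UNIX level" idea, here is a minimal sketch using standard `sort` and `awk`. The file names, delimiter, and column positions are assumptions for the example, not Mario's actual layout; it also shows that once the data is sorted (or even unsorted, with an in-memory array), a one-pass `awk` sum can do a simple group-and-aggregate outside DataStage entirely:

```shell
# Hypothetical two-column input: key,value (sample data for illustration only)
printf 'b,2\na,1\nb,3\na,4\n' > input.csv

# Sort on the first (key) field so records for each key are contiguous;
# -t',' sets the field delimiter, -k1,1 restricts the sort key to field 1
sort -t',' -k1,1 input.csv > sorted.csv

# One pass over the sorted file: sum the value column per key.
# The trailing sort just makes the output order deterministic.
awk -F',' '{ sum[$1] += $2 } END { for (k in sum) print k "," sum[k] }' \
    sorted.csv | sort > agg.csv

cat agg.csv
# -> a,5
#    b,5
```

A presort like this also lets you tell the Aggregator the data is already grouped, which is what cuts its memory use.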

Posted: Thu Jul 01, 2004 8:05 pm
by vmcburney
Thanks for the clarification.

I remember the last time I had to aggregate a large amount of data, the Aggregator stage would eventually fail, and I had to resort to putting the data into a staging table and aggregating in a database stage.