Hi,
I have an aggregator that must aggregate 20M records, but it aborts.
I suppose it could be a memory problem. The aggregator works well up to 6-7M records; above that amount it fails.
How can I resolve this problem?
I tried to use the Sort plug-in, but it is very slow. Can the Sort plug-in aggregate the records?
Thanks in advance,
Mario
Aggregator - problem with memory
Moderators: chulett, rschirm, roy
- vmcburney
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
On those older versions of DS Aggregator should be renamed Aggrevator. If your source data is in a table you may get much better performance by doing the aggregation in the source database plug-in. In DataStage the sort stage will sort but not aggregate.
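To illustrate the point about pushing the aggregation into the source database: a GROUP BY lets the database return the already-aggregated result set, so the job never has to hold 20M detail rows. A minimal sketch using SQLite, with made-up table and column names (sales, region, amount) purely for illustration:

```python
# Hedged sketch: aggregate in the source database with GROUP BY instead of
# feeding raw rows to the job's Aggregator. Schema and data are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 10.0), ("EU", 5.0), ("US", 7.5)],
)

# The database does the grouping and summing; the job only reads the
# small, pre-aggregated result set.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('EU', 15.0), ('US', 7.5)]
```

The same idea applies to any source database stage that accepts user-defined SQL: the heavy lifting happens where the data already lives.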
Run your job again and keep an eye on temp file space as the job runs. The aggregator writes a lot of data to temporary files while it aggregates the input data.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney
Actually, it's the Sort stage that uses temp files, the Aggregator works in memory without landing anything. Unless things were different back in 5.x but I don't believe so.
You can substantially reduce the amount of memory (and time) used by the Aggregator by presorting the data and asserting the sort order in the Aggregator stage by marking the appropriate fields. Then again, this advantage may be offset by the amount of time and resources it takes to sort the data in the job. If the Sort stage is too slow, is there any way you can deliver the data to the job already sorted? Perhaps a simple sort at the UNIX level, or some external sorting package you may have access to? Or can the data be created in the order required to support the aggregation?
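The reason presorted input helps so much can be sketched in a few lines: when the rows arrive already sorted on the grouping key, each group can be summed and emitted as soon as the key changes, so memory use stays constant no matter how many rows flow through. A small illustration (the key/value layout is invented for the example, not a DataStage API):

```python
# Hedged sketch of streaming aggregation over presorted input: groupby()
# only buffers one group at a time, so memory does not grow with row count.
from itertools import groupby
from operator import itemgetter

rows = [("a", 1), ("a", 2), ("b", 5), ("b", 1), ("c", 3)]  # presorted on key

totals = [
    (key, sum(value for _, value in group))
    for key, group in groupby(rows, key=itemgetter(0))
]
print(totals)  # [('a', 3), ('b', 6), ('c', 3)]
```

By contrast, aggregating unsorted input forces the stage to keep every partial group in memory at once, which is exactly what blows up around 6-7M distinct keys.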
-craig
"You can never have too many knives" -- Logan Nine Fingers
- vmcburney
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
Thanks for the clarification.
I remember last time I had to aggregate a large amount of data the aggregation stage would eventually fail and I had to resort to putting the data into a staging table and aggregating in a database stage.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney