data segment (heap) size

Ratan Babu N
Participant
Posts: 34
Joined: Tue Dec 27, 2005 12:13 am

data segment (heap) size

Post by Ratan Babu N »

Hi,
When I run a job that loads data into a DB2 stage, it aborts with the following messages.


Db2udbXXX,2: The current soft limit on the data segment (heap) size (2147483645)
is less than the hard limit (2147483647), consider increasing the heap size limit


Db2udbXXX,2: Fatal Error: Throwing exception: APT_BadAlloc: Heap allocation failed.


Under what circumstances will it show these messages?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Typically when your data segment size (as set by the ulimit command) is not large enough. Get your UNIX administrator to make its default unlimited.
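
For example (a minimal sketch, assuming a ksh or bash shell; the exact flags and whether you need root to raise the hard limit vary by UNIX flavour):

    ulimit -a             # show all current limits for this shell
    ulimit -d unlimited   # raise the data segment (heap) limit
    ulimit -s unlimited   # raise the stack limit

The limits have to be in effect in the environment the parallel engine actually runs under (for example the DataStage owner's login profile or the dsenv file), not just in an interactive session.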
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Daddy Doma
Premium Member
Posts: 62
Joined: Tue Jun 14, 2005 7:17 pm
Location: Australia
Contact:

Post by Daddy Doma »

On UNIX, does this error message relate to the data limit or the file limit?

I have an Aggregator stage that has to handle up to 40 million records, and I get the same error. The following warning messages then show in the log:
  • My current heap size= 1,856,298,288 bytes in 35,701,573 blocks.
  • Followed by "Failure in operator logic" for the aggregator stage.
I checked "ulimit -a" and it shows that whilst my time(seconds) and data(kbytes) are unlimited, my file(blcoks) and coredump(blocks) are set as 2097151.

Thanx,

Zac.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

At least data and stack should be set to "unlimited" for parallel jobs.
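
To make that stick for the engine's owner account rather than just the current shell, the per-user limits file is usually the place (a sketch only; the stanza below assumes AIX's /etc/security/limits and a dsadm owner account, which is an example name - on Linux the equivalent entries go in /etc/security/limits.conf):

    dsadm:
        data = -1
        stack = -1

In the AIX limits file a value of -1 means unlimited; the DataStage engine needs to be restarted (or the user logged out and back in) before the new limits are picked up.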
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
opdas
Participant
Posts: 115
Joined: Wed Feb 01, 2006 7:25 am

Post by opdas »

Zac,
When aggregating large volumes of data, a good approach is to set the Aggregator's method to "sort" and to sort the records on the grouping keys just before the Aggregator stage.
Om Prakash


"There are things that are known, and there are things that are unknown, and in between there are doors"
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Indeed. Guidelines suggest that you should use "hash table" only with fewer than about 1000 distinct values of the grouping column(s) per MB of memory.
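
As a rough worked example (the distinct-group count is invented purely for illustration): suppose the 40 million input rows collapse to about 2,000,000 distinct grouping-key values. At roughly 1,000 groups per MB, the hash method would need in the order of

    2,000,000 groups / 1,000 groups per MB = ~2,000 MB

of heap just for its table, which is right at the 2 GB data segment ceiling shown in the original error. The sort method only ever holds the current group in memory, so its footprint stays flat no matter how many groups there are.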
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Daddy Doma
Premium Member
Posts: 62
Joined: Tue Jun 14, 2005 7:17 pm
Location: Australia
Contact:

Post by Daddy Doma »

Thanks Om, Thanks Ray,

I've looked at my jobs and identified areas where I can fix this aggregation issue. I will add a Sort stage and repartition before each Aggregator and use the Sort Method.

This is a lesson I will not forget - things were fine when developing with only 100 records, but when I ran with production volumes (up to 52 million records) the problems started!

Thanx,

Zac.
thompsonp
Premium Member
Posts: 205
Joined: Tue Mar 01, 2005 8:41 am

Post by thompsonp »

Zac,

As Ray points out, it is not the number of input records that matters but the number of result groups you will get. If that approaches 1,000 distinct groups per MB of available memory, use the sort method for the aggregation.
Don't forget that 'available memory' is shared with any other stages running alongside the Aggregator.