Heap Allocation failed

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Heap Allocation failed

Post by Kirtikumar »

Hi,

One of the jobs in production is failing with the error: APT_CombinedOperatorController,1: Fatal Error: Throwing exception: APT_BadAlloc: Heap allocation failed.

The job is simple: a Data Set stage to an ODBC stage with no other stage in between. We do no explicit repartitioning; partitioning is set to Auto.

I have searched DSXchange a lot for heap problems, but nothing has helped so far. We set the production user's limits to unlimited for all options, yet the job still fails with the same error. In the first run it ran alongside many other jobs and failed; in the second run it ran alone, with no other job running, and it still failed.
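
For reference, this is roughly how we verified the limits (just a sketch; the account name dsadm below is only a placeholder for whatever ID the jobs actually run under):

Code:

# Limits of the current login session
ulimit -a

# Limits another account would get at login (needs permission to switch)
su - dsadm -c 'ulimit -a'

# The value that usually matters for APT_BadAlloc is the data segment size
ulimit -d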

The only remaining thing to try is database limits. The DB user is not a Windows user; I think it was created purely for database access. So I am wondering how one can check a DB user's limits to see whether that could be the problem.

If nothing else works, we are going to add a Copy stage (Force = True, so the operators are not combined) and give it a try. Will post the results.

Any suggestions in the meantime are welcome.
Regards,
S. Kirtikumar.
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

I also checked memory on the Unix box. Real memory is fully used throughout the day, but virtual memory usage stays at just 10% the whole time.

On the Windows box where SQL Server is installed, real memory utilisation averages 95% and virtual memory usage is around 30%.
Regards,
S. Kirtikumar.
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

After one more scan of the posts in this category, I found a post from Ray that said to check the limits for the $USER environment variable. In production I do not have rights to log on as another user.

So instead I added the command "whoami; ulimit -a" to the job, and it returned the following:

Code:

*** Output from command was: ***
gecd_liv
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         1310720
stack(kbytes)        unlimited
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) unlimited
Which means there is a data segment limit of 1310720 KB, i.e. 1342177280 bytes, which is the figure reported in the logs.
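
Just to double-check the arithmetic (a trivial sketch):

Code:

# data(kbytes) converted to bytes: 1310720 KB * 1024 = 1342177280 bytes
expr 1310720 \* 1024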

When I check the limits for gecd_liv interactively, they are unlimited. To me that suggests the internal user is different, i.e. whatever $USER is when the job runs, and that the job used that user's limits.

We are now trying to change the limits for $USER to see how it goes. Fingers crossed... will post the result.
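
If the limits turn out to be fine at login but not inside the job, one thing we may try (just a sketch, assuming the jobs inherit their limits from the DataStage engine daemon dsrpcd, which in turn inherits them from the shell that started it) is to raise the limit and restart the engine:

Code:

# Raise the data segment limit in the current shell, then restart the
# engine so dsrpcd -- and the jobs it spawns -- inherit the new limit.
ulimit -d unlimited
cd $DSHOME
./bin/uv -admin -stop
./bin/uv -admin -start

# Then re-run the "whoami; ulimit -a" check from inside a job to confirm.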
Regards,
S. Kirtikumar.
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

We checked the limits for $USER and they are unlimited for everything. Yet ulimit, when run from a DataStage job, still reports a limit on the data segment.

Is there any dsenv parameter that needs to be changed so that this limit changes?

Any thoughts?
Regards,
S. Kirtikumar.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

There's nothing in dsenv.

There is another thread open on this question; see whether reading that helps your diagnosis.

What is the value of the USER environment variable in the second log event (environment variables) for the job?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.