
java::lang::OutOfMemoryError from dsjob command

Posted: Thu Feb 25, 2016 2:39 pm
by meet_deb85
Hi All,

We have run into a strange situation. We have moved a new project into our production environment, where we trigger Parallel jobs through a UNIX shell script, and the script in turn calls the dsjob command.

Every now and then we get a java::lang::OutOfMemoryError and our script fails. The job that hits this error is not consistent; it happens quite randomly. Please note that this does not happen when we run the jobs individually or through a DS Sequence. Strangely, after a failure the DS Director still shows the job status as successful. It's just the script that reports the error:
Status code = 0
terminate called after throwing an instance of 'java::lang::OutOfMemoryError'
JVMDUMP039I Processing dump event "abort", detail "" at 2016/02/24 07:04:13 - please wait.
JVMDUMP032I JVM requested System dump using '<DSInstalltionPath>/Server/DSEngine/bin/core.20160224.070413.27717.0001.dmp' in response to an event
JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.27757.

JVMDUMP010I System dump written to <DSInstalltionPath>/Server/DSEngine/bin/core.20160224.070413.27717.0001.dmp
JVMDUMP032I JVM requested Snap dump using '<DSInstalltionPath>/Server/DSEngine/bin/Snap.20160224.070413.27717.0002.trc' in response to an event
JVMDUMP010I Snap dump written to <DSInstalltionPath>/Server/DSEngine/bin/Snap.20160224.070413.27717.0002.trc
JVMDUMP007I JVM Requesting JIT dump using '<DSInstalltionPath>/Server/DSEngine/bin/jitdump.20160224.070413.27717.0003.dmp'
JVMDUMP010I JIT dump written to <DSInstalltionPath>/Server/DSEngine/bin/jitdump.20160224.070413.27717.0003.dmp
JVMDUMP013I Processed dump event "abort", detail "".
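
A note on reading the output above: "terminate called" followed by an "abort" dump event normally means the process ended on SIGABRT, and a shell reports a signal death as an exit status of 128 or more (134 = 128 + 6 for SIGABRT). That lets the calling script tell a crash of the dsjob client apart from an ordinary return. A minimal sketch, assuming a POSIX-style shell; `classify_exit` and `run_and_classify` are hypothetical helper names, not part of DataStage:

```shell
# Classify an exit status as seen by the calling shell.
# 128 or more means the process was killed by a signal
# (134 = 128 + 6, SIGABRT), i.e. the client itself crashed.
classify_exit() {
    rc=$1
    if [ "$rc" -ge 128 ]; then
        echo "client-crash"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        # Meaning depends on the options the command was run with.
        echo "nonzero"
    fi
}

# Run any command, then print the classification of its exit status.
run_and_classify() {
    "$@"
    classify_exit "$?"
}
```

For example, `run_and_classify dsjob -run -jobstatus <project> <job>` (project and job are placeholders) would print `client-crash` if the client aborted. At that point the script could re-check the job's real status, for instance with `dsjob -jobinfo`, before declaring a failure. As far as I know, with `-jobstatus` the exit status of dsjob reflects the job's finishing status, so a non-zero value there is not necessarily an error.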


Also, when this error occurs we can see 4 files created in the $DSEngine/bin directory:
javacore.20160222.131129.13369.0002.txt
Snap.20160222.131129.13369.0003.trc
jitdump.20160222.131129.13369.0004.dmp
core.20160222.142701.17390.0001.dmp

And the .txt file has a whole lot of information which I am not able to decode.
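
One way to make the javacore .txt file more approachable is to pull out only the few header lines that usually matter first. The sketch below assumes the usual IBM J9 javacore layout, where each line starts with a tag: `1TISIGINFO` for the event that triggered the dump, `2CIUSERARG` for the JVM options in effect (such as `-Xmx`), and `1STHEAP...` for heap totals. The tag list is an assumption and may differ between JVM versions; `summarize_javacore` is a hypothetical helper name.

```shell
# Print the header lines of an IBM javacore that are usually worth
# reading first. The tag names are assumed from the J9 javacore layout
# and may not all be present in every JVM version.
summarize_javacore() {
    # $1: path to a javacore*.txt file
    grep -E '^(1TISIGINFO|1TIDATETIME|2CIUSERARG|1STHEAPTOTAL|1STHEAPINUSE|1STHEAPFREE)' "$1"
}
```

Running it as `summarize_javacore javacore.20160222.131129.13369.0002.txt` should show, among other things, which heap settings the failing process actually started with, which helps confirm whether a heap size change was picked up.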

We have tried a lot of options, such as increasing the Java heap size, restarting the IIS and DSEngine services, restarting the server, and checking the Java lib path in dsenv, but nothing has helped. A PMR has been raised with IBM, but they haven't provided any resolution.

Before we migrated the project to PROD, everything ran successfully in the DEV and QA environments using the same set of scripts and jobs.

Really appreciate any help on this.

Posted: Thu Feb 25, 2016 3:17 pm
by PaulVL
I would look at what else is/was running on the box at the time of failure. You could be the victim not the culprit.

You could run an nmon trace on the box to gather info during your test window.
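
As a rough sketch of what such a capture could look like: nmon takes an interval (`-s`) and a snapshot count (`-c`), so the count has to be worked out from the length of the test window. The helper below only prints the command rather than running it; the function name, the 30-second interval, the 60-minute window and the output directory are all illustrative assumptions.

```shell
# Print the nmon command for a capture window (nothing is executed here).
nmon_command() {
    interval=$1      # seconds between snapshots
    window_min=$2    # length of the test window in minutes
    outdir=$3        # hypothetical directory for the .nmon file
    count=$(( window_min * 60 / interval ))
    # -f: write to a file, -t: include top processes,
    # -s/-c: interval and number of snapshots, -m: output directory
    echo "nmon -f -t -s $interval -c $count -m $outdir"
}

nmon_command 30 60 /tmp/nmon_out
# prints: nmon -f -t -s 30 -c 120 -m /tmp/nmon_out
```

Including top processes (`-t`) is the useful part here, since the goal is to see what else was consuming memory on the box at the time of a failure.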