Unable to access InfoSphere server through any of the clients

arvind_ds
Participant
Posts: 428
Joined: Thu Aug 16, 2007 11:38 pm
Location: Manali

Post by arvind_ds »

Hi Experts,

I am not able to connect to the InfoSphere server through any of the InfoSphere clients (DataStage Designer, Director, Administrator, Information Server Console, Web Console, WorkBench, Business Glossary Browser, Business Glossary Anywhere, Information Server Manager).

I am getting the error messages below in the log files under /opt/IBM/WebSphere/AppServer/profiles/default/logs/server1.


---------------------------------------------------------------------------------------------

SystemOut.log:[6/14/10 14:49:24:008 CEST] 00000023 ThreadMonitor W WSVR0605W: Thread "ORB.thread.pool : 0" (00000039) has been active for 756729 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.
SystemOut.log:[6/14/10 14:49:24:068 CEST] 00000023 ThreadMonitor W WSVR0605W: Thread "ORB.thread.pool : 1" (0000003a) has been active for 694933 milliseconds and may be hung. There is/are 2 thread(s) in total in the server that may be hung.

SystemOut.log:[6/8/10 15:04:17:017 CEST] 00000069 ServletWrappe E SRVE0068E: Could not invoke the service() method on servlet unicorn_action_servlet. Exception thrown : java.lang.OutOfMemoryError
SystemOut.log:[6/8/10 13:47:54:476 CEST] 000000c1 ExceptionUtil E CNTR0020E: EJB threw an unexpected (non-declared) exception during invocation of method "cacheFireEvent" on bean "BeanId(ASB_managers.ear#ASB_managers.jar#Initialization, null)". Exception data: java.lang.OutOfMemoryError
SystemOut.log:[6/8/10 16:52:29:250 CEST] 000000b7 NotificationD W ADME0006W: An exception occurred sending notification [source=WebSphere:platform=proxy,cell=pt-eims-d001Node01Cell,version=6.0.2.31,name=RasLoggingService,mbeanIdentifier=cells/pt-eims-d001Node01Cell/nodes/pt-eims-d001Node01/servers/server1/server.xml#RASLoggingService_1224875878603,type=RasLoggingService,node=pt-eims-d001Node01,process=server1, message=null, sequence=777, type=websphere.ras.warning, time=1275998017633, data=com.ibm.ejs.ras.RasMessageImpl2@4b5824b7] to LocalNotificationService: java.lang.OutOfMemoryError
SystemOut.log:[6/8/10 16:52:34:837 CEST] 00000069 WebApp E SRVE0026E: [Servlet Error]-[unicorn_action_servlet]: java.lang.OutOfMemoryError
SystemOut.log: java.lang.OutOfMemoryError
SystemOut.log: java.lang.OutOfMemoryError
SystemOut.log:Caused by: java.lang.OutOfMemoryError
SystemOut.log:[6/8/10 16:52:51:901 CEST] 000000ce AlarmThreadPo W Encountered a failure in the fireAlarm method java.lang.OutOfMemoryError
SystemOut.log:[6/8/10 16:52:59:021 CEST] 0000002c WebApp E SRVE0026E: [Servlet Error]-[unicorn_action_servlet]: java.lang.OutOfMemoryError
SystemOut.log:[6/14/10 17:23:16:728 CEST] 00000041 SystemOut O 2010-06-14 17:17:54,089 ERROR ojb.OjbPersistentEObjectPersistence - java.lang.OutOfMemoryError
SystemOut.log:[6/14/10 20:24:39:321 CEST] 00000041 ExceptionUtil E CNTR0020E: EJB threw an unexpected (non-declared) exception during invocation of method "loadByRid" on bean "BeanId(ACS_server.ear#ejb-LXMeta.jar#OperationalRepositoryLocalStatelessService, null)". Exception data: java.lang.OutOfMemoryError
SystemOut.log:[6/14/10 20:24:48:411 CEST] 00000041 ExceptionUtil E CNTR0020E: EJB threw an unexpected (non-declared) exception during invocation of method "getEvents" on bean "BeanId(ACS_server.ear#ACS_server.jar#LoggingQueryService, null)". Exception data: com.ibm.ejs.container.UnknownLocalException: ; nested exception is: com.ibm.ws.exception.WsEJBException: nested exception is: java.lang.OutOfMemoryError
SystemOut.log:Caused by: com.ibm.ws.exception.WsEJBException: nested exception is: java.lang.OutOfMemoryError
SystemOut.log:Caused by: java.lang.OutOfMemoryError



native_stderr.log:<AF[281]: warning! free memory getting short(1). (38576/1073674752)>
native_stderr.log:JVMDG217: Dump Handler is Processing OutOfMemory - Please Wait.
native_stderr.log:JVMDG274: Dump Handler has Processed OutOfMemory.

---------------------------------------------------------------------------------------------
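To see how the OutOfMemoryError entries are spread across days, grep-style lines like the ones above can be tallied by date. This is only a minimal Python sketch: it assumes each entry starts with `SystemOut.log:[M/D/YY ...` as in the excerpt and that wrapped lines have been joined back together. The name `count_oom_by_date` is made up for illustration and is not part of WebSphere or DataStage.

```python
import re
from collections import Counter

# Matches the date at the start of a grep-style line such as
# "SystemOut.log:[6/8/10 15:04:17:017 CEST] ...".
TIMESTAMP = re.compile(r"^SystemOut\.log:\[(\d{1,2}/\d{1,2}/\d{2}) ")


def count_oom_by_date(lines):
    """Count timestamped log entries that mention OutOfMemoryError, per date."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        # Lines without a timestamp (stack trace fragments) are skipped
        # so that one failure is not counted several times.
        if match and "java.lang.OutOfMemoryError" in line:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the excerpt in this post.
sample = [
    'SystemOut.log:[6/8/10 16:52:34:837 CEST] 00000069 WebApp E SRVE0026E: '
    '[Servlet Error]-[unicorn_action_servlet]: java.lang.OutOfMemoryError',
    'SystemOut.log: java.lang.OutOfMemoryError',
    'SystemOut.log:[6/8/10 16:52:59:021 CEST] 0000002c WebApp E SRVE0026E: '
    '[Servlet Error]-[unicorn_action_servlet]: java.lang.OutOfMemoryError',
    'SystemOut.log:[6/14/10 14:49:24:008 CEST] 00000023 ThreadMonitor W WSVR0605W: '
    'Thread "ORB.thread.pool : 0" (00000039) has been active for 756729 '
    'milliseconds and may be hung.',
]

for date, count in sorted(count_oom_by_date(sample).items()):
    print(date, count)
```

Dates on which the count jumps are a reasonable starting point for asking what ran, or what changed, on that day.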

In addition, the files below are being created in the /opt/IBM/WebSphere/AppServer/profiles/default folder, and as a result the entire file system (/opt/IBM/Websphere) is filling up. I deleted these files to free the space.

-rw-r--r-- 1 root system 366966603 Jun 15 08:36 heapdump975016.1276583638.phd
-rw-r--r-- 1 root system 2392809 Jun 15 08:38 javacore975016.1276583771.txt
-rw-r--r-- 1 root system 412833863 Jun 15 10:22 heapdump975016.1276589974.phd
-rw-r--r-- 1 root system 2614768 Jun 15 10:24 javacore975016.1276590133.txt
-rw-r--r-- 1 root system 433017428 Jun 15 10:28 heapdump975016.1276590364.phd
-rw-r--r-- 1 root system 2621373 Jun 15 10:30 javacore975016.1276590530.txt
-rw-r--r-- 1 root system 433576203 Jun 15 10:34 heapdump975016.1276590677.phd
-rw-r--r-- 1 root system 2621384 Jun 15 10:36 javacore975016.1276590843.txt
-rw-r--r-- 1 root system 433654521 Jun 15 10:39 heapdump975016.1276590981.phd
-rw-r--r-- 1 root system 2621366 Jun 15 10:41 javacore975016.1276591156.txt
-rw-r--r-- 1 root system 433812326 Jun 15 10:44 heapdump975016.1276591298.phd
-rw-r--r-- 1 root system 2621348 Jun 15 10:46 javacore975016.1276591466.txt
-rw-r--r-- 1 root system 434111820 Jun 15 10:49 heapdump975016.1276591608.phd
-rw-r--r-- 1 root system 2621374 Jun 15 10:51 javacore975016.1276591784.txt
-rw-r--r-- 1 root system 434362601 Jun 15 10:55 heapdump975016.1276591965.phd
-rw-r--r-- 1 root system 2621312 Jun 15 10:57 javacore975016.1276592143.txt
-rw-r--r-- 1 root system 434429850 Jun 15 11:01 heapdump975016.1276592295.phd
-rw-r--r-- 1 root system 2621329 Jun 15 11:03 javacore975016.1276592468.txt
-rw-r--r-- 1 root system 434430808 Jun 15 11:06 heapdump975016.1276592603.phd
-rw-r--r-- 1 root system 176967903 Jun 15 12:38 heapdump585966.1276598246.phd
-rw-r--r-- 1 root system 2506732 Jun 15 12:39 javacore585966.1276598315.txt

I restarted WebSphere Application Server, after which I could access all InfoSphere clients for a while, but the same problem returned the very next day. This problem never happened before.

The initial heap size configured in WAS is 300 MB and the maximum heap size is 1 GB. The InfoSphere server has 6 GB of RAM and 2 CPUs. We typically run 30 to 40 DataStage jobs each day on this server, and this issue never occurred before. At most 2 to 3 jobs run in parallel.

Any help on this is much appreciated.
Arvind
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

arvind_ds wrote: We typically run 30 to 40 DataStage jobs each day on this server, and this issue never occurred before.
What has changed?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by arvind_ds »

We have not changed anything on the server. I am wondering why this error keeps occurring.
Arvind

Post by ray.wurlod »

I doubt that's the case. For example, were the NodeAgents formerly started as root and are now started as a non-root user? Have ulimit values been changed? Are you using the same user to execute?

Post by arvind_ds »

The NodeAgents are started by the root user only, and there has been no change in ulimit values.
Arvind

Post by arvind_ds »

Also, we are using the same user, 'dsadm', to run the DataStage jobs. Please note that we use the root user to start and stop WAS, the InfoSphere Server Engine and the NodeAgents.
Arvind
antonyraj.deva
Premium Member
Posts: 138
Joined: Wed Jul 16, 2008 9:51 pm
Location: Kolkata

Post by antonyraj.deva »

Arvind,

The reason you cannot connect is that one or more of your jobs is driving up the server's heap usage (using all available free memory).

This mainly happens because of the Java Virtual Machine (JVM) that InfoSphere uses.

Try either increasing the memory on the server or setting the "ulimit" values in dsenv to unlimited.

--Tony

Post by arvind_ds »

Here are the ulimit values for the dsadm user, obtained by running ulimit -a from an ExecSH before-job subroutine in a sample DataStage job run by dsadm.


Untitled2..BeforeJob (ExecSH): Executed command: ulimit -a
*** Output from command was: ***
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 2097152
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) 2097151
nofiles(descriptors) 10240
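
The data segment limit can also be read from inside a process started by the user in question. This is a minimal Python sketch using the standard `resource` module (Unix only). The module reports bytes, whereas the `ulimit -a` output above is in kbytes, so the hypothetical helper `describe_limit` converts the value for comparison.

```python
import resource


def describe_limit(value):
    """Render a limit the way ulimit does: 'unlimited' or a number of kbytes."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} kbytes"


# RLIMIT_DATA is the data segment limit, which is what `ulimit -d` reports.
soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
print("data segment soft limit:", describe_limit(soft))
print("data segment hard limit:", describe_limit(hard))
```

Running this as dsadm and as the user that starts WAS shows whether the two see the same limits.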


Also, the JVM heap size values are given below.
Initial heap size : 300 MB
Max heap size : 1024 MB

Do I need to change any of these values?
Arvind

Post by ray.wurlod »

ulimit -d should probably be larger, ideally unlimited.

Post by antonyraj.deva »

Arvind,

A maximum heap size of 1 GB is usually more than enough for a JVM, but Sun recommends 2 GB for complex applications and/or 64-bit servers.

In addition to setting ulimit -d to unlimited, also set the coredump ulimit to unlimited.

If the problem still persists, we will need to take a close look at the job design.

--Tony

Post by antonyraj.deva »

Arvind,

Is the issue resolved now?

--Tony