RPC daemon is not running (81016)
-
- Participant
- Posts: 48
- Joined: Thu May 05, 2005 9:24 pm
Hi All,
I restarted the DS server. When I checked for client connections and running jobs, one job was hanging. As before, I was not able to clean it up using DS.TOOLS, so I killed the process manually.
After that I stopped and started the server.
Now I get this error.
[b]Failed to connect to host: oppt.in.ibm.com, project: UV
(The connection was refused or the RPC daemon is not running (81016))[/b]
I checked the DS docs, which say that to restart the dsrpc daemon manually you do a stop and start of the server.
I have tried that multiple times, but I still see the same error.
Could anyone help us urgently, as we are now unable to connect?
Thanks a ton
Regards,
Kalyan
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Have you searched for 81016 on the forum? You might also search for ways to debug why the RPC daemon may not be starting ("-d9" will help, so might "netstat", so might "BOMBED").
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 48
- Joined: Thu May 05, 2005 9:24 pm
I am following the advice below from the documentation, but it is not working:
DataStage Client to UNIX Server Connections
If you cannot connect from a DataStage client to a UNIX server, check that
the dsrpcd daemon is running. The dsrpcd daemon is started when the
DataStage server is installed, and should start automatically when you
reboot. If the daemon has stopped for some reason, restart it with the
following command:
dshome/bin/uv -admin -start
dshome is the DataStage server engine home directory.
Regards,
Kalyan
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
That is incomplete advice.
After starting DataStage you should check that the dsrpcd process is running.
Code:
ps -ef | grep dsrpcd | grep -v grep
If it is not, then you must find out why. Usually this involves the netstat command, to determine whether there are any connected DataStage processes preventing dsrpcd from binding to port number 31538.
You can also test this theory - provided you have superuser access - by starting dsrpcd manually, capturing debugging information. Assuming you have the DataStage bin directory in your path, and are in the "home" directory:
Code:
nohup dsrpcd -d9 > /tmp/dsrpcd.log 2>&1 &
Typically you get a four line log indicating the problem within a few seconds of executing the command.
Search the forum for techniques for disconnecting the sleeping connections.
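The netstat check described in this post could be scripted roughly as follows. This is only a sketch: the function name dsrpc_port_lines is made up for illustration, port 31538 is the one quoted above, and the filter assumes "netstat -an" style output where the local address is the fourth field and the port follows a "." (AIX style) or a ":" (Linux style).

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -an` output on stdin and print the TCP
# lines whose local address ends in the dsrpcd port.
# Assumption: the local address is field 4 and the port is separated from
# the host by "." (AIX style) or ":" (Linux style).
DSRPC_PORT=31538

dsrpc_port_lines() {
    awk -v port="$DSRPC_PORT" '
        $1 ~ /^tcp/ && $4 ~ ("[.:]" port "$") { print }
    '
}

# Usage (not run here):
#   netstat -an | dsrpc_port_lines
```

Any line this prints while DataStage is stopped is a candidate for whatever is still holding the port. Without -n, netstat may show the service name (for example dsrpc) instead of the number, in which case a plain grep on the name is the simpler check.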
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 48
- Joined: Thu May 05, 2005 9:24 pm
Hi Ray,
Thanks a ton for the reply.
I have done what you suggested, and this is the output.
>nohup dsrpcd -d9 > /tmp/dsrpcd.log 2>&1 &
[1] 123012
dsrpcd.log ->
RPCPID=123012 - 11:29:25 - uvrpc_debugflag=9 (Debugging level)
RPCPID=123012 - 11:29:25 - In rpc_init()
RPCPID=123012 - 11:29:25 - bind bombed errno=67
RPCPID=123012 - 11:29:25 - listen failed
Have you ever seen this error? Let me know, and in the meantime I will search for this error.
Thanks and Regards,
Kalyan
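For anyone reading a log like the one above: the line to look for is the failed bind. A small sketch for pulling the errno out of such a log is shown here; the function name bind_errno is hypothetical, and it assumes the "bind bombed errno=N" wording seen in this thread. On AIX (the platform confirmed later in the thread), errno 67 is, as far as I know, EADDRINUSE ("Address already in use"); the number differs on other platforms, so check errno.h on your own system.

```shell
#!/bin/sh
# Hypothetical helper: print the errno reported by a failed bind in a
# dsrpcd -d9 debug log (log path passed as $1), or nothing if none is found.
# Assumption: the log uses the "bind bombed errno=N" wording quoted above.
bind_errno() {
    sed -n 's/.*bind bombed errno=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Usage (not run here):
#   e=$(bind_errno /tmp/dsrpcd.log)
#   [ -n "$e" ] && echo "dsrpcd could not bind its port (errno $e)"
```

An "address already in use" result would fit the explanation given earlier in the thread, that something is still attached to the dsrpcd port.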
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Kalyan,
If you issue -stop while there are live client connections to the server, dsrpcd cannot be brought up properly when -start is run. Make sure all client connections are closed before doing -stop.
To get dsrpcd running again in this situation, Ray may have a better idea using the netstat information, and I would like to know how to do it too. However, if you are desperate and rebooting the UNIX box is not a problem, a reboot will reset everything back to normal, which means dsrpcd will run again.
Hopefully it helps!
Pneuma.
-
- Participant
- Posts: 48
- Joined: Thu May 05, 2005 9:24 pm
Hi Ray
The netstat command gave the following output.
>netstat| grep dsr
tcp4 0 0 oppt.in.ibm.com.dsrpc vtammine.in.ibm..2816 CLOSE_WAIT
Do we have to kill the above process?
One more thing: when you said "Free the processes", which processes were you referring to?
I am not sure about the entries below, as there are no process identifiers etc. that would let us identify them in order to clear or kill them.
RPCPID=123012 - 11:29:25 - uvrpc_debugflag=9 (Debugging level)
RPCPID=123012 - 11:29:25 - In rpc_init()
RPCPID=123012 - 11:29:25 - bind bombed errno=67
RPCPID=123012 - 11:29:25 - listen failed
Could you please clarify this for me?
Thanks a lot
Regards,
Kalyan
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Before killing the process, you might like to check (with an lsof command) that it is a DataStage process. Also use ipcs -m | grep ade to make sure that no DataStage shared memory segments (and hence no DataStage processes) remain after you've shut DataStage down; then a restart should be fine.
Did you search? There are 65 hits on netstat alone. If you search for netstat and lsof (all terms) you will narrow to a small number of useful hits.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 48
- Joined: Thu May 05, 2005 9:24 pm
Hi Ray,
Thanks for all your notes, and sorry for the trouble caused!
Currently we have brought down the DataStage server.
Checks done while the server is down:
We used the netstat|grep dsr command and it gives the following result
>netstat|grep dsr
tcp4 0 0 oppt.in.ibm.com.dsrpc vtammine.in.ibm..2816 CLOSE_WAIT
We also used the ipcs -m|grep ade command, but nothing is displayed.
Also, there is no process associated with CLOSE_WAIT
ps -ef|grep CLOSE_WAIT
dsadm 128910 123754 0 12:41:51 pts/1 0:00 grep CLOSE_WAIT
As it is not associated with a process, how do we kill it? We can see this CLOSE_WAIT only in the netstat output; it does not appear in ps.
Thanks and Regards,
kalyan
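A note on the ps check above: CLOSE_WAIT is a TCP connection state reported by netstat, not a process name, so grepping ps output for it only ever matches the grep itself. CLOSE_WAIT normally means the local end has not yet closed the socket, so the lsof check suggested earlier is one way to look for an owner. A rough sketch, assuming lsof is installed and prints its usual layout (command in field 1, PID in field 2, TCP state in parentheses at the end of the line); the function name close_wait_owners is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `lsof -nP -iTCP:31538` output on stdin and print
# "PID COMMAND" for every socket shown in the CLOSE_WAIT state.
# Assumption: usual lsof layout, with COMMAND in field 1, PID in field 2 and
# the TCP state in parentheses at the end of the line.
close_wait_owners() {
    awk '/\(CLOSE_WAIT\)$/ { print $2, $1 }' | sort -u
}

# Usage (not run here; needs lsof and usually root):
#   lsof -nP -iTCP:31538 | close_wait_owners
```

If this prints nothing while netstat still shows the connection, no user process may own the socket any longer, which is the situation where waiting for the TCP timeout or rebooting comes into play.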
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The quickest solution is to stop and restart UNIX. That MUST unbind the port (given your employer I'm assuming you're not on a Tru64 cluster).
Otherwise it's a long and tedious process to identify the process associated with the CLOSE_WAIT (or FIN_WAIT or FIN_WAIT2) state so that you can remove it.
I don't have access to DataStage at the moment (doing something else) so all of my replies have been from memory. That's why I've been pushing you to search. For example, you may need to check (using ndd command) what the default TCP timeout is on the system.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 48
- Joined: Thu May 05, 2005 9:24 pm
Hi Ray,
We rebooted our AIX box and the problem was resolved.
We are able to connect to our DS project.
As for the earlier problem of the sequencer that could not be deleted or compiled because we had physically removed the RT_CONFIGXXX file: when we now tried to open this sequencer, it reported that it was unable to find it, and on refresh it was removed.
So we were able to rename the backed-up sequencer to the current sequencer's name and compile it successfully.
As of now, the problems seem to be resolved. :D
I want to personally thank you and the others for the help provided.
Have a great day ahead.
Thanks and Regards,
Kalyan