
Stopped/started DS, RPC daemon problem

Posted: Wed Mar 30, 2005 8:04 am
by PhilHibbs
I just stopped and started DataStage using these commands:

bin/uv -admin -stop
(wait 30 seconds)
bin/uv -admin -start

When starting any DS application I get this:

Failed to connect to host: corus, project: UV
(The connection was refused or the RPC daemon is not running (81016))

I've done this loads of times, most recently yesterday (I learned early on that waiting 30 seconds between them is a good idea) but this time, it's broken. Any suggestions? All eyes are on me expectantly...

(the reason I am doing this is that large SAP IDoc loads seem to be more reliable if this is done beforehand)

Posted: Wed Mar 30, 2005 8:15 am
by mhester
Phil,

I'm not sure that 30 seconds is sufficient on its own. You should issue the following commands to ensure that nothing is returned prior to starting the services -

Code:

netstat -a | grep rpc
ps -ef | grep phantom
ps -ef | grep -E 'dsapi|dsslave'

If these are clean then you should be able to start, but if they return anything then you need to wait until it is clear. I have seen in the past that this can take minutes (3 - 5) before it is clean.
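
As a rough sketch only, the two ps checks above could be folded into one filter. The function name `ds_leftovers` is hypothetical (it is not a DataStage command), and the pattern simply reuses the three process names mentioned in this post; adjust it if your process names differ.

```shell
# Hypothetical helper (not a DataStage command): reads `ps -ef` output on
# stdin and prints only lines that look like leftover DataStage processes.
ds_leftovers() {
    # Match the three names from the commands above; drop the grep itself.
    grep -E 'phantom|dsapi|dsslave' | grep -v 'grep'
}

# On the server you would run:  ps -ef | ds_leftovers
# No output (and a non-zero exit status) suggests it is clear to start.
```

Reading from stdin rather than calling ps inside the function keeps the filter easy to try against saved output before relying on it.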

Sometimes a memory segment can cause problems and you can list DataStage memory segments with the following -

Code:

lpcs -mop | grep dae

This will return a listing of memory segments where it may look like -

Code:

 0xdaec......

and remove them using the following -

Code:

lpcrm -m [enter the ID from the above command]

I've had to use all of the above commands at one point or another to get the services started again. It's been a year or so since I was last on a Unix box, but I believe this will help you. Others may have more current information on naming convention etc...
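
A hedged sketch of the segment lookup, using the spelling `ipcs` that is confirmed later in this thread. The function name `ds_shm_ids` is hypothetical, and it assumes the segment ID is the second column of the listing, which you should verify against your own platform's output. It only prints IDs; removing a segment is left as a deliberate manual step.

```shell
# Hypothetical helper: reads `ipcs -m` style output on stdin and prints the
# ID of every shared memory segment whose key starts with 0xdae.
# Assumption: the ID is in column 2 of the listing; check this locally.
ds_shm_ids() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^0xdae/) { print $2; break }
        }
    }'
}

# On the server (flags as given in this thread):  ipcs -mop | ds_shm_ids
# Review each ID by hand before passing it to:    ipcrm -m <ID>
```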

Regards,

Posted: Wed Mar 30, 2005 8:16 am
by roy
Hi,
This has already been covered in previous posts.
Read the manuals for more insight.

If there are active connections, dsrpcd will not restart until they are removed.
Sometimes there is a 5-10 minute timeout for client connections (check using netstat | grep dsrpc).
A good practice is to always bring the DS service down only after making sure no one is connected.
If and when you have brought the dsrpcd service down via uv -admin -stop, make sure you don't have connections; if there are any, have them terminated. Only then will the dsrpcd service come up when you use uv -admin -start.
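
One possible way to script that wait, offered as a sketch rather than a tested procedure. The function names `has_dsrpc` and `wait_for_dsrpc_clear` and the default retry count and delay are all hypothetical; the only things taken from this post are the netstat check for dsrpc and the 5-10 minute timeout.

```shell
# Hypothetical helper: exits 0 if the netstat output on stdin contains any
# dsrpc entry. Lines such as *.sunrpc do not match this pattern.
has_dsrpc() {
    grep -q 'dsrpc'
}

# Hypothetical wait loop: re-checks netstat until no dsrpc entries remain.
# Defaults (20 tries, 30 seconds apart, roughly 10 minutes) are assumptions.
wait_for_dsrpc_clear() {
    max_tries=${1:-20}
    delay=${2:-30}
    tries=0
    while netstat -a | has_dsrpc; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1    # still connected; do not start yet
        fi
        sleep "$delay"
    done
    return 0
}

# On the server:  wait_for_dsrpc_clear && bin/uv -admin -start
```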

IHTH,

Posted: Wed Mar 30, 2005 8:19 am
by PhilHibbs

Code:

netstat -a | grep rpc

That command is currently returning:

Code:

tcp4       0      0  *.sunrpc               *.*                    LISTEN
udp4       0      0  *.sunrpc               *.*

Are these DataStage related?

Posted: Wed Mar 30, 2005 8:24 am
by roy
Chances are that by the time you're rechecking, all is well, so try uv -admin -start again; you should be fine.

It won't hurt to check for zombie phantom jobs in case you had any abnormal terminations.

Oops, I forgot: no, they are not relevant. You need to check for *.dsrpc entries.

Posted: Wed Mar 30, 2005 10:55 am
by PhilHibbs
mhester wrote:

Code:

lpcs -mop | grep dae

I don't have an lpcs or lpcrm command.

Do you mean ipcs and ipcrm? (thanks to Google for that suggestion)

Posted: Wed Mar 30, 2005 11:49 am
by roy
IMHO he meant that, but having a sysadmin at hand to perform tasks like this is advice worth taking.

Posted: Wed Mar 30, 2005 12:26 pm
by chulett
PhilHibbs wrote: Do you mean ipcs and ipcrm? (thanks to Google for that suggestion)
Yes, he did. :wink:

Posted: Wed Mar 30, 2005 12:52 pm
by mhester
I fat fingered the i and l - sorry!

Posted: Wed Mar 30, 2005 6:01 pm
by chinek
PhilHibbs wrote:

Code:

netstat -a | grep rpc

That command is currently returning:

Code:

tcp4       0      0  *.sunrpc               *.*                    LISTEN
udp4       0      0  *.sunrpc               *.*

Are these DataStage related?

No, those are not the DataStage ones. If there were any, they would look like this:

earth.34902 earth.dsrpc 32768 0 32768 0 FIN_WAIT_2

Your best bet is to kill off all DataStage processes on your Unix server before restarting. Also make sure all client sessions are logged off.

Nick

Check ps before starting uv.

Posted: Thu Mar 31, 2005 2:51 am
by goma
It would be a good idea to insert check logic for the DS processes rather than just waiting for 30 seconds.

bin/uv -admin -stop
(wait 30 seconds)
bin/uv -admin -start

The above script becomes something like this:

bin/uv -admin -stop

# Loop while any phantom, dsapi or dsslave process is still listed.
# The matching ps lines are kept in temp.ps for inspection.
while ps -ef | grep -E 'phantom|dsapi|dsslave' | grep -v grep > temp.ps
do
    # temp.ps is not empty: wait a few minutes, then run ps again.
    sleep 180
done

bin/uv -admin -start

Posted: Thu Mar 31, 2005 8:56 am
by PhilHibbs
Thanks for all the suggestions.

It now turns out that stopping and starting datastage is not a panacea for large IDoc loads, as I just did that and it went more badly wrong than it has ever done before. Oh well.

Posted: Thu Mar 31, 2005 9:57 am
by roy
Hi,
There is one flaw in goma's method!
You should make sure no one is connected before performing uv -admin -stop.

Posted: Thu Mar 31, 2005 6:58 pm
by goma
Oh, yes. Thanks roy,

It is necessary to check at least the phantom and dsapi_slave processes. Are they all there is to check?