UNIX dsapi_slave

ebertocco
Participant
Posts: 6
Joined: Fri Jan 16, 2004 5:10 am
Location: Brazil

UNIX dsapi_slave

Post by ebertocco »

Hi,
Before describing my problem, let me explain a bit of what I know.
I'm working with DataStage 5.2 under UNIX, and the server is not in the same city where I work. I've seen that every time I open a new Client, a process called dsapi_slave is started on UNIX, and every time I close the Client it disappears.
The problem I'm having is that DataStage is crashing. When I look at UNIX I see lots of dsapi_slave processes running (more than 100). The strange thing is that only 2 people use DataStage and we never open more than 4 Clients at the same time. When it crashes, these dsapi_slave processes start on their own, one after another, until they make DataStage crash.
When it happens I have to contact the UNIX administrators to shut DataStage down, kill all the dsapi_slave processes, wait until all connections to DataStage are closed (netstat -a|grep uv) and then start it up again.
You can imagine my face every time I see that it has crashed: :shock:
If anybody knows why this happens, please help me. It's getting hard to keep my DW up to date.
Thanks,
Eduardo Bertocco
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Re: UNIX dsapi_slave

Post by kcbland »

ebertocco wrote:The problem I'm having is that DataStage is crashing.
You mean your client sessions, not the DataStage daemon. Correct?
ebertocco wrote: When it happens I have to contact UNIX Administrators to put it down, kill all dsapi_slave processes, wait until all connections to DataStage are closed (netstat -a|grep uv) and then put it up again.
If something stupid happens to your client session and it crashes, then just go kill the dsapi_slaves yourself. You have the permissions to do so; they should be showing under your login.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ebertocco
Participant
Posts: 6
Joined: Fri Jan 16, 2004 5:10 am
Location: Brazil

Re: UNIX dsapi_slave

Post by ebertocco »

No, I mean my Server (or daemon) crashes. I have to stop the Server and kill the dsapi_slave processes (I have permission to kill them, but when this problem happens it doesn't work, which is why I have to call the UNIX admin). If I don't do this, the dsapi_slave processes keep multiplying and I can't open new Clients.
Eduardo Bertocco
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Re: UNIX dsapi_slave

Post by kcbland »

ebertocco wrote: No, I mean my Server (or daemon) crashes. I have to stop the Server, ...
This is impossible. If the daemon crashes on the Unix server, there is nothing to STOP! Are you talking about your PC? If your Client sessions on your PC crash, this will leave slave processes on the Unix server. You can telnet to the Unix server, log in using your Unix login, and kill the zombie slave processes.

If the daemon is crashing on the Unix server, you can't start up any more clients on your PC, so your statements can't be correct.
Kenneth Bland

ebertocco
Participant
Posts: 6
Joined: Fri Jan 16, 2004 5:10 am
Location: Brazil

Re: UNIX dsapi_slave

Post by ebertocco »

Maybe I'm not being clear, as my English is not so good. Despite that, this problem really does seem crazy. I'll try again.

When I say that the daemon crashes, I mean that the DS services are still running (I can see them on UNIX) but I can't work anymore, because when I try to open a Client I get the message: "The connection was refused or the RPC daemon is not running". The DS jobs that were running at that time are still there (I can see them on UNIX, but they never end and they don't accept my kill commands). The dsapi_slave processes (more than 100) are running on UNIX and new ones keep starting automatically. I don't know what causes these dsapi_slave processes to start, and they don't accept my kill commands either.

I can't stop the DS services: uv -admin -stop only works after the UNIX administrator has killed all the dsapi_slave processes with the root login. If I try to use DS after the UNIX administrator kills all the dsapi_slave processes, it doesn't work anyway, so I have to stop it (uv -admin -stop), and I can only start it again when there are no dsapi_slave processes running and no connections open (netstat -a|grep uv).

Sometimes I have network problems that break the connection between my Client and the Server, and this causes the creation of a new dsapi_slave, but when I close my Client and re-open it the dsapi_slave processes go back to normal. The problem I'm describing normally happens during the night, when nobody is working with DS. No Client is open. Only scheduled jobs are running.

I appreciate your attention, but I really need someone who has seen this before. I've already worked with my UNIX administrator to find out what is going on, but we couldn't find the cause.
Eduardo Bertocco
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Okay, if you don't want my assistance, I'll back out of the discussion after I tell you this to try, because some might say I'm an expert on this product.

When you start up a client and try to connect to a server, the server launches a slave process to handle communication between your PC and the job design repository on the server. If your PC keeps losing its connections, those slaves become zombies and sit out there. If a kill -15 doesn't stop the slave, then a kill -9 is okay. If you read the S99ds.rc script, you'll see that its logic does this when shutting down the daemon.

So, you should be able to kill your zombie slaves that way, which won't require recycling services. Now, the reason you probably can't connect anymore is that you have reached a point where all further connections are refused. You would get the same message if you simply typed your password in wrong. It's a generic message that's been around since the beginning of the product, and it is masking the real error message.
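
As a rough illustration of the kill -15 then kill -9 escalation described above, here is a minimal Python sketch. The names pid_exists and terminate, the grace period, and the polling interval are assumptions made for this example; none of them come from DataStage or from S99ds.rc.

```python
import os
import signal
import time


def pid_exists(pid):
    """Probe a PID with signal 0 (nothing is delivered, only the check runs)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate(pid, grace=5.0, poll=0.1, is_alive=pid_exists):
    """Send SIGTERM (kill -15), wait up to `grace` seconds, then SIGKILL (kill -9).

    Returns "gone" if the process was not there, otherwise the name of the
    last signal that was sent.
    """
    if not is_alive(pid):
        return "gone"
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "gone"  # it exited between the check and the signal
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        if not is_alive(pid):
            return "SIGTERM"
        time.sleep(poll)
    if not is_alive(pid):
        return "SIGTERM"
    os.kill(pid, signal.SIGKILL)
    return "SIGKILL"
```

Calling terminate(pid) for each leftover dsapi_slave PID that you own tries the polite signal first and only forces the kill if the process is still there after the grace period. Note that the signal 0 probe also reports a defunct process as existing until its parent reaps it.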

Now, if your network is dropping your connections, then that's just the way it is. I also predict that when you are designing jobs you will now have to deal with locked jobs. You can search this forum for those issues and how to resolve them. Ultimately, you need to fix the network issue; then DataStage's propensity for zombie slaves won't be so noticeable.

Well that's it. If you don't want my assistance, I'll gladly save my time and allow the others to help you.
Kenneth Bland

ebertocco
Participant
Posts: 6
Joined: Fri Jan 16, 2004 5:10 am
Location: Brazil

Post by ebertocco »

I apologise for that. Your assistance is really welcome. I'll try to kill my slaves as you told me. What I still don't understand is why these slaves keep appearing from nowhere, as it normally happens during the night, when no Client is open and so no slaves should be running. Another thing I'd like to ask: if I can kill all the hundreds of dsapi_slave processes, do you think the job processes that are locked on my UNIX server will keep running? What I've seen until now is that these job processes keep existing on UNIX but don't actually seem to be running.
Eduardo Bertocco
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Are you on AIX?
Kenneth Bland

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

When a DataStage client connects to the server, a dsapi_server process is started (by the dsrpcd daemon) to manage the connection between server and client. If any work has to be done, which is usually the case, the dsapi_server process will spawn a child dsapi_slave process. These processes are unrelated to running DataStage jobs; they relate only to connected DataStage clients.

Next time this happens, get your UNIX system administrator to take a look at the state of each of these processes; in particular, are they zombies, and are their parent (dsapi_server) processes still running?
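
One possible way to do that check, sketched in Python under some assumptions: the input is the text of a ps listing with the columns pid, ppid, state, user and args in that order (for example from something like ps -eo pid,ppid,s,user,args, where the exact column keywords depend on your ps), and the sample listing below is invented for illustration. The function names are hypothetical. On some platforms a defunct process no longer shows its command name, in which case the name match below will miss it.

```python
def parse_ps(text):
    """Parse a ps listing with columns: pid, ppid, state, user, args."""
    procs = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        pid, ppid, state, user = fields[:4]
        procs[int(pid)] = {
            "ppid": int(ppid),
            "state": state,
            "user": user,
            "args": fields[4] if len(fields) == 5 else "",
        }
    return procs


def classify_slaves(procs):
    """Group dsapi_slave PIDs into zombie, orphaned and attached."""
    report = {"zombie": [], "orphaned": [], "attached": []}
    for pid, proc in sorted(procs.items()):
        if "dsapi_slave" not in proc["args"]:
            continue
        parent = procs.get(proc["ppid"])
        if proc["state"].startswith("Z"):
            report["zombie"].append(pid)      # defunct, waiting to be reaped
        elif parent is None or "dsapi_server" not in parent["args"]:
            report["orphaned"].append(pid)    # its dsapi_server parent is gone
        else:
            report["attached"].append(pid)    # parent dsapi_server still running
    return report


# Invented sample listing, for illustration only.
SAMPLE = """
  PID  PPID S USER     COMMAND
  100     1 S root     dsrpcd
  200   100 S edu      dsapi_server
  201   200 S edu      dsapi_slave
  305     1 S edu      dsapi_slave
  410   200 Z edu      [dsapi_slave] <defunct>
"""

print(classify_slaves(parse_ps(SAMPLE)))
```

A large "orphaned" group would point at slaves whose client connection handler has already gone away, while a large "zombie" group would point at a parent that is not reaping its children.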

It may be beneficial to check their state via netstat to determine if they're waiting for a TCP timeout to occur. A UNIX system administrator can shorten the TCP timeout interval using the ndd command, iirc.
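
To get a quick picture of those connection states, a small Python sketch like the following could tally the last column of netstat output for the lines you care about. The function name, the sample lines and the port 9999 are made up for this example, and the code simply assumes that the TCP state is the last field on each line, which is worth checking against your platform's netstat format.

```python
import re
from collections import Counter

# A TCP state looks like ESTABLISHED, CLOSE_WAIT, FIN_WAIT_2, TIME_WAIT, ...
STATE = re.compile(r"^[A-Z][A-Z0-9_]+$")


def count_states(netstat_text, needle):
    """Count TCP states on the netstat lines that contain `needle`."""
    counts = Counter()
    for line in netstat_text.splitlines():
        if needle not in line:
            continue
        fields = line.split()
        if fields and STATE.match(fields[-1]):
            counts[fields[-1]] += 1
    return counts


# Invented sample output, for illustration only.
SAMPLE = """
tcp        0      0 10.0.0.5:9999      0.0.0.0:*          LISTEN
tcp        0      0 10.0.0.5:9999      10.0.0.9:1045      ESTABLISHED
tcp        0      0 10.0.0.5:9999      10.0.0.9:1046      CLOSE_WAIT
tcp        0      0 10.0.0.5:9999      10.0.0.9:1047      CLOSE_WAIT
tcp        0      0 10.0.0.5:22        10.0.0.9:1050      ESTABLISHED
"""

for state, n in sorted(count_states(SAMPLE, ":9999").items()):
    print(state, n)
```

Passing "uv" as the needle would mirror the netstat -a|grep uv filter used earlier in this thread. Many connections sitting in a single waiting state would suggest they are waiting on a timeout rather than doing work.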

I'd also be curious to determine why this is happening as often as your post seems to indicate. Are there network drop-out issues? What is your DataStage inactivity timeout (set in the dsrpcservices file)?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
larryoceanview
Participant
Posts: 70
Joined: Fri Dec 26, 2003 3:14 pm
Location: Plantation, FL

Do Slave Zombies effect performance?

Post by larryoceanview »

ray.wurlod wrote:When a DataStage client connects to the server, a dsapi_server process is started (by the dsrpcd daemon) to manage the connection between server and client. If any work has to be done, which is usually the case, the dsapi_server process will spawn a child dsapi_slave process. These processes are unrelated to running DataStage jobs; they relate only to connected DataStage clients.
Do these zombie slave processes cause performance problems?
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: Do Slave Zombies effect performance?

Post by ogmios »

I've seen it happen that "slave zombies" cause performance problems.

You can see this e.g. by checking via 'ps -ef' how many 'CPU seconds' are being used by a slave process. Not all of the slaves show this behaviour but it seems some slaves keep processing stuff after a client session has a broken connection.
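
A small Python sketch of that check, assuming a ps listing reduced to the columns pid, time and args (for example from something like ps -eo pid,time,args) with the CPU time printed as [dd-][hh:]mm:ss. The function names, the 60 second threshold and the sample listing are assumptions for illustration.

```python
def cpu_seconds(time_field):
    """Convert a ps TIME value such as 0:05, 01:02:03 or 1-00:00:10 to seconds."""
    days = 0
    if "-" in time_field:
        day_part, time_field = time_field.split("-", 1)
        days = int(day_part)
    seconds = 0
    for part in time_field.split(":"):
        seconds = seconds * 60 + int(part)
    return days * 86400 + seconds


def busy_slaves(ps_text, threshold=60):
    """Return (pid, cpu_seconds) for dsapi_slave rows at or above the threshold."""
    busy = []
    for line in ps_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) == 3 and "dsapi_slave" in fields[2]:
            used = cpu_seconds(fields[1])
            if used >= threshold:
                busy.append((int(fields[0]), used))
    return busy


# Invented sample listing, for illustration only.
SAMPLE = """
  PID     TIME COMMAND
  201     0:02 dsapi_slave
  305    12:40 dsapi_slave
  410 01:05:00 dsapi_slave
  500  3:00:00 some_other_process
"""

print(busy_slaves(SAMPLE))
```

Running the check twice a few minutes apart and comparing the numbers shows whether a slave is still accumulating CPU time after its client has gone.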

The trick, if you are running all jobs under the same user id, would be to kill all slaves at a fixed time, e.g. 01:00 AM.

Regards,
Ogmios
ebertocco
Participant
Posts: 6
Joined: Fri Jan 16, 2004 5:10 am
Location: Brazil

Re: Do Slave Zombies effect performance?

Post by ebertocco »

I've been out for some time but I'm back again.
I've tried everything that you all told me to do. I was able to kill all the dsapi_slave processes that appeared using the "kill -9" command, but they kept appearing again, and even after killing all of them I wasn't able to open any Clients anymore. Once more I had to stop the DS services and start them again.
Eduardo Bertocco
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

It's time to call technical support and get some value out of that maintenance agreement. The forum is good for quick and dirty support, but it sounds like you probably have a known issue. If you're on 5.x, most bugs have been found and documented. I suspect that you have some issue specific to your OS (I asked if you're on AIX, but you never answered).

Please let us know what the resolution is so that everyone benefits. Hey, this is post 1000!
Kenneth Bland

ebertocco
Participant
Posts: 6
Joined: Fri Jan 16, 2004 5:10 am
Location: Brazil

Post by ebertocco »

My client no longer has Technical Support from Ascential and doesn't want to pay for it, so I don't know how to help him. We have 2 different environments and 2 different versions of DataStage (because of a joint venture), and only the environment I'm responsible for has the problem (Version 5.2 running on a SUN platform). On the other environment (Version 5.1 running on an HP platform) it doesn't happen.
Thank you for your attention and help. I'll keep trying to find the problem, and as soon as I solve it I'll let you know.
Eduardo Bertocco