After massive logging activity, cannot log in to DS

Posted: Fri Aug 12, 2011 10:03 am
by ASU_ETL_DEV
Hello,
One of our production jobs started filling its log file with millions of messages. After that we could not stop it, and every attempt to log in to any project failed. We had to ask the root user to kill the job's processes.
In the process we also killed the existing client connections. We rebooted the Windows client machine and also the DS server, but the problem persists.

The error is:
Failed to connect to host: XXXXXX, project: ASU_OTHER_APPS_PRD
(The connection is broken (81002))

If I try to log in to another project, I get the following error:
Failed to connect to host: XXXXXX, project: UV
(The connection is broken (81002))

Any suggestions to diagnose the problem would be much appreciated.
Thanks

Posted: Fri Aug 12, 2011 10:47 am
by cppwiz
Are you able to log on to the web console? (http://servername:9080)
Sometimes when a specific job gets hosed, I can't run the job or open its log. To fix this I log on to the Administrator client, click the Command button on the Projects tab and issue this command:

DS.REINDEX ALL

You could also try these commands from a Unix terminal session:
cd $DSHOME
bin/dssh
LOGTO projectname
DS.TOOLS
QUIT

Posted: Fri Aug 12, 2011 2:16 pm
by ASU_ETL_DEV
Thanks for the reply.

I can connect to the project via dssh and can query the repository table DS_JOBS. I found out which RT_LOG is the one for my job and checked its size. It is close to 1 GB.
I could clear the RT_LOG. However, since I can connect and log in to the project with dssh but not with the clients, this sounds more like a client/server connection problem.
What other tests can I run to confirm that this is actually the problem?
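
One test that fits the question above is to check, on the server, whether anything is listening on the port the clients connect to. The sketch below is only an illustration: port_is_listening is a made-up helper name, and 31538 is assumed to be the dsrpc port (it is the usual default, but check the dsrpc entry in /etc/services on your server).

```shell
#!/bin/sh
# Hypothetical helper: reads "netstat -an" style text on stdin and
# succeeds if some line shows the given port in LISTEN state.
# Matches both "0.0.0.0:31538" (Linux) and "*.31538" (AIX/Solaris) forms.
port_is_listening() {
    grep -E "[.:]$1[[:space:]]" | grep -q "LISTEN"
}

# Example use on the DS server (31538 is an assumed default port):
# netstat -an | port_is_listening 31538 \
#     && echo "something is listening on the dsrpc port" \
#     || echo "nothing is listening on the dsrpc port"
```

If nothing is listening, the client error is a server-side startup problem rather than a network one. If the port is open and the clients still fail, the next place to look is how the engine was started.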

Posted: Fri Aug 12, 2011 4:42 pm
by ASU_ETL_DEV
It turns out that the UNIX engineer who was bringing up DataStage was skipping the step that sources dsenv. Once that step was included in the start sequence, I was able to connect from the client and clean up the job's log through Director.

Thanks.
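
For anyone who hits the same thing, here is a rough sketch of a start sequence that refuses to start the engine unless dsenv has been sourced first. start_ds_engine is a hypothetical wrapper name, not an IBM-supplied script, and the sketch assumes the usual layout where dsenv and bin/uv both live under $DSHOME.

```shell
#!/bin/sh
# Hypothetical wrapper: start the DataStage engine only after sourcing dsenv.
# $1 is the engine home directory (normally the value of $DSHOME).
start_ds_engine() {
    dshome="$1"
    if [ ! -r "$dshome/dsenv" ]; then
        echo "dsenv not found in $dshome, not starting the engine" >&2
        return 1
    fi
    # Run in a subshell so the caller's directory and environment stay untouched.
    (
        cd "$dshome" || exit 1
        # Source dsenv so its exported settings reach the engine processes.
        . ./dsenv
        bin/uv -admin -start
    )
}

# Example call, run as the DataStage administrative user
# (assumes $DSHOME is already set in the calling shell):
# start_ds_engine "$DSHOME"
```

The point of the wrapper is simply that the two steps cannot be separated: if dsenv is missing or unreadable, the engine is not started at all.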