Job performance drops


michaelsarsnorum
Participant
Posts: 31
Joined: Wed Aug 31, 2005 1:55 am

Post by michaelsarsnorum »

I have a job that seems to have stopped. It still shows up as running and looks OK, except that it has been running for 5 hours when it usually takes about 3 minutes.

The number of rows processed has been standing still for the last 4 hrs.

The job is quite simple. Reads from a table and writes to a sequential file.

I've tried stopping the job through the Director, but this has no effect at all. Has anyone got any ideas as to what can cause this? (There are no local database or PL/SQL processes invoked by the job.)

Any help is appreciated.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

If you do a UNIX fuser {yourSeqFile} does it show any users that have it open? It might be that the process has stopped or aborted and not updated the status file that DS uses to see if a job has finished.
michaelsarsnorum
Participant
Posts: 31
Joined: Wed Aug 31, 2005 1:55 am

Post by michaelsarsnorum »

The file is created by the job. None of the users who are supposed to use the file have the rights to access this instance of it. I transfer the file to a different server once the job is complete.

The thing is that the job has processed about 16000 rows and seems to be standing still.

m.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Use the fuser command to see if the file is still open by DS. If it isn't then most likely your process has failed and not updated the status file. If it is open, check the PID to see if it is your job.

This is only step 1 in diagnosing the problem.
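
A minimal sketch of that first check, assuming a POSIX shell with fuser installed. The function name file_open_status and the path in the usage line are hypothetical and do not come from the thread. It captures only standard output, where fuser prints the PIDs, and discards whatever fuser writes to standard error.

```shell
# Report whether any process has the given file open.
# Returns 0 if open, 1 if not open, 2 if the check could not be made.
file_open_status() {
    file=$1
    command -v fuser >/dev/null 2>&1 || { echo "fuser not found"; return 2; }
    [ -e "$file" ] || { echo "no such file: $file"; return 2; }
    # fuser exits non-zero when nothing has the file open, so ignore its status
    pids=$(fuser "$file" 2>/dev/null) || true
    if [ -n "$pids" ]; then
        echo "open by PID(s): $pids"
        return 0
    fi
    echo "not open"
    return 1
}
```

Usage would look like `file_open_status /path/to/yourSeqFile`. It is probably best run as the user that owns the job (or as root), since fuser may not be able to see processes belonging to other users.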
michaelsarsnorum
Participant
Posts: 31
Joined: Wed Aug 31, 2005 1:55 am

Post by michaelsarsnorum »

The file is not open.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

OK, so your job died. You can use the Director to clear your status file or just recompile the job to get it back into a runnable state. You might also want to check with your DBA to see if there are still connections from your user open in the database (there shouldn't be); and perhaps he/she can monitor the connection when you re-run the job. I would look at the DB connectivity for the error considering the simple job flow you mentioned.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
Are you expecting more than 16000 rows to appear in the file as the output from the table?

-Kumar
michaelsarsnorum
Participant
Posts: 31
Joined: Wed Aug 31, 2005 1:55 am

Post by michaelsarsnorum »

Yes. The file should be quite a lot larger.

The file that currently resides in the destination folder, created by the job in question, is about 93% smaller than the file generated last month.

I've checked that there is enough disk space on the server to actually dump the data onto the disk.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Your job is tragically dying in some way. Monitor its execution via "ps -ef|grep yourjobname" to see the active stage processes of the job design. Do it right now to see if there are any zombies left over. You can always check whether the job is still live this way. Please describe the general design, i.e. stages and flow.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
michaelsarsnorum
Participant
Posts: 31
Joined: Wed Aug 31, 2005 1:55 am

Post by michaelsarsnorum »

I can't get ps -ef to give me the right result. This is a Solaris server, where ps works slightly differently than on Linux. In addition, I'm using DS 5.2 on this machine.

The jobs are still running. They have now been running for 10 hours.

The only output I get from ps that I think is related to Datastage is this:
root 1169 1 0 10:38:29 ? 0:00 /opt/app/Ascential/5_2/DataStage/unishared/unirpc/unirpcd
dsadm 2640 2639 0 14:25:46 ? 2:14 dsapi_slave 8 7 0
dsadm 2639 1169 0 14:25:45 ? 0:01 dscs 4 0 0
dsadm 3713 3712 0 19:19:41 ? 0:11 dsapi_slave 8 7 0


I reckon the two slave processes are the ones that represent the two jobs that seem to hang. Is it safe for me to issue a kill <pid> on both of them? As mentioned previously, stopping the jobs in Director has no effect at all. Neither of the two has processed a single row since my last post.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

First you need to log out of your client connections (Director, Designer, Manager), then see if there are any processes left under "dsadm". If yes, you can issue a kill {pid} but DO NOT issue a kill -9 {pid}.
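
A sketch of the plain kill approach, assuming a POSIX shell. The helper name terminate_gently and its optional wait time are hypothetical. It sends the default signal (SIGTERM), polls with kill -0 to see whether the process has gone, and never escalates to kill -9.

```shell
# Send a plain "kill" (SIGTERM) and wait up to $2 seconds (default 10).
# Returns 0 if the process exited, 1 if it is still alive after the wait,
# 2 if it could not be signalled at all.
terminate_gently() {
    pid=$1
    wait_secs=${2:-10}
    if ! kill "$pid" 2>/dev/null; then
        echo "could not signal PID $pid (already gone, or not permitted)" >&2
        return 2
    fi
    while [ "$wait_secs" -gt 0 ]; do
        # kill -0 sends no signal; it only tests whether the PID still exists
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        wait_secs=$((wait_secs - 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    echo "PID $pid is still alive after SIGTERM" >&2
    return 1
}
```

For example, `terminate_gently 2640 30` would use one of the PIDs from the ps listing earlier in the thread, and only after the client sessions have been logged out.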

I don't think your processes are still running, though. As has been stated before, the jobs have terminated (do you see a "core" file in the project directory with today's date?) but the Director shows them running since the status file hasn't been updated. You can check by trying to compile the job - if it compiles then what we've stated about the job dying is correct, if you get an error stating that the job is running then your assumption is correct.
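
The core-file check could be sketched as below, assuming a POSIX shell and find. The function name recent_core_files and the project path in the usage line are hypothetical. It also assumes the dump is literally named "core", as in the post, and uses "modified within the last 24 hours" as an approximation of "today's date".

```shell
# List files named "core" under a directory that were modified
# within the last 24 hours (-mtime -1).
recent_core_files() {
    dir=$1
    find "$dir" -type f -name core -mtime -1 2>/dev/null
}
```

Usage would look like `recent_core_files /path/to/your/project`. Empty output means no recent core file was found there.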
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

dsapi_slave and dscs are client sessions; don't kill them unless they belong to clients that were tragically disconnected.

If "ps -ef|grep phantom" doesn't show any threads with your jobname somewhere in it then that means there are no active threads executing. You need to Clear Status on the job from Director to get it back into an executable state, or recompile the job.

What's wrong with "ps -ef"? It works on Solaris and Linux just fine.
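
The phantom check can be wrapped in a small filter, sketched here for a POSIX shell. The function name job_phantoms and the job name in the usage line are hypothetical. It assumes, as described above, that the job's processes show both the word "phantom" and the job name on their command line. It reads ps output from standard input, so it can also be run against a saved listing.

```shell
# Filter "ps -ef"-style output on stdin for phantom processes that
# mention the given job name. The job name is treated as a grep pattern.
# Exit status is non-zero when nothing matches.
job_phantoms() {
    job=$1
    grep phantom | grep "$job"
}
```

Usage would look like `ps -ef | job_phantoms YourJobName`. No output, with a non-zero exit status, suggests that no threads for that job are still executing.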