
Job running forever....

Posted: Fri Apr 27, 2007 9:44 am
by madhukar
In one of our runs, a job stayed in running mode forever and eventually had to be killed.

The sequence of events was:
Job sequences are triggered through Control-M.
The first job sequence completed successfully and the second sequence started.
The first job in the second sequence stayed in running mode forever.
The job didn't even start reading the file. The job log showed only:

Starting Job <jobname>.
pxPRSLoadWrkrPvt02..BeforeJob (ExecSH): Executed command: touch <filename> *** No output from command ***
Environment variable settings:
Parallel job initiated
Parallel job default NLS map ISO-8859-1, default locale OFF
Advanced runtime options used: -default_date_format "%yyyy-%mm-%dd"

In this case, how do I go about debugging without aborting the job?

PS: the job ran successfully after it was killed and rerun.

Posted: Fri Apr 27, 2007 9:46 am
by DSguru2B
How often does this happen? Are you purging your log files regularly? Are you also keeping the &PH& folder size in check?
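
For keeping an eye on &PH&, here is a rough Python sketch that counts the files under a directory and totals their size. The function name and the path in the usage comment are placeholders of mine, not anything DataStage provides; point it at the &PH& folder inside your own project directory.

```python
import os


def dir_summary(path):
    """Return (file_count, total_bytes) for all files under path."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.lstat(full).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            count += 1
    return count, total


# Hypothetical usage (replace with your real project path):
# count, size = dir_summary("/path/to/project/&PH&")
# print(count, "files,", size, "bytes")
```

If the count keeps growing from run to run, that suggests the folder is not being cleaned up.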

Posted: Fri Apr 27, 2007 9:52 am
by madhukar
This is the first time it has happened.
Auto-purge of the log has been set to one day.

Posted: Thu May 03, 2007 12:22 am
by madhukar
The problem is recurring now...
Can somebody help with this?

Posted: Thu May 03, 2007 12:41 am
by nick.bond
Can you see a process for the job running? You need to work out whether the job is actually running, i.e. has a process, or whether the process has died and the problem is with the way Director is reporting the status of the job.

Posted: Thu May 03, 2007 6:09 am
by madhukar
If I look at the process in Unix, the job's process is in SLEEP state, with the last column (COMMAND) showing phantom...
I used the Unix top command.
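
The same check can be scripted with ps instead of top. This is only a sketch: it assumes a ps that accepts the POSIX `-e` and `-o pid,user,etime,args` options, and the search string "phantom" is simply the COMMAND value seen above. The function names are made up for illustration.

```python
import subprocess


def parse_ps(text, needle):
    """Parse 'ps -e -o pid,user,etime,args' output and keep matching rows."""
    rows = []
    for line in text.splitlines()[1:]:  # first line is the header
        parts = line.strip().split(None, 3)
        if len(parts) < 4:
            continue
        pid, user, etime, args = parts
        if needle in args:
            rows.append(
                {"pid": int(pid), "user": user, "etime": etime, "args": args}
            )
    return rows


def find_processes(needle):
    """Run ps and return the processes whose command line contains needle."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid,user,etime,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout, needle)


# Example: list candidate job processes with their owner and elapsed time.
# for row in find_processes("phantom"):
#     print(row["pid"], row["user"], row["etime"], row["args"])
```

The user column shows which account the process is running under, and the elapsed time shows how long it has been alive.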

Posted: Thu May 03, 2007 7:09 am
by chulett
So... in other words, there's no problem? Phantom just means DataStage background process, which all jobs run as and SLEEP just means it's very tired. :wink:

Just how busy is your server? What exactly is your job doing? Could there be other resource issues that are causing the job to stall - database, perhaps?

Posted: Thu May 03, 2007 3:49 pm
by nick.bond
Can you provide a description of what your job consists of? Which stages? Files? Databases? Before-job routines?

Posted: Sat May 05, 2007 12:45 am
by madhukar
After a lot of research, this is what we found:

If the job was compiled by a user other than the one used to start it, the job hangs.
Now, after compiling the jobs with the same user that starts them, the jobs are running fine.

Any thoughts on this?

Posted: Sat May 05, 2007 1:47 am
by ray.wurlod
Permissions in the project directory and/or umask settings for the users are inappropriate.
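
One way to look for this, as a rough Python sketch: print the umask of the current user and list anything under a directory that is not group-writable. It assumes the compiling and running users share a Unix group and rely on group write access, which may not match your setup. The function names and the path in the usage comment are placeholders.

```python
import os
import stat


def current_umask():
    """Return the process umask without changing it permanently."""
    old = os.umask(0)  # os.umask sets a new mask and returns the old one
    os.umask(old)      # restore the original mask straight away
    return old


def not_group_writable(path):
    """Return (path, owner_uid, mode_string) for entries lacking group write."""
    found = []
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            full = os.path.join(root, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue  # entry vanished or is unreadable
            if not st.st_mode & stat.S_IWGRP:
                found.append((full, st.st_uid, stat.filemode(st.st_mode)))
    return found


# Hypothetical usage, run as each of the users involved:
# print("umask: %03o" % current_umask())
# for entry in not_group_writable("/path/to/project"):
#     print(entry)
```

A umask of 022, for example, creates files without group write permission, so files written by one user could not be modified by another user in the same group.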

Posted: Sat May 05, 2007 7:00 am
by chulett
I guess I haven't really seen this because, outside of the Dev environment anyway, a single functional id is used to compile and run all jobs. Always done it that way.