Jobs/Sequences getting stuck in Running status repeatedly


vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm


Post by vivekgadwal »

In our Test environment, jobs or sequences repeatedly get stuck in Running status indefinitely, which delays the run of other jobs/sequences. The problem is not tied to any one job in particular; it appears to be completely random.

If I do a "Cleanup Resources" in the Director, "Logout" that one process, and restart the master sequence, it runs fine from the point of failure. I mean, the job that caused the issue before now runs fine. Later, another one might get stuck, or not; there is no guarantee. Can anybody suggest what might be going on here? This has become a pain in the neck because we are running processes in Test as part of our System Testing, and somebody has to keep monitoring for these kinds of problems. If we leave it unchecked, we get a buzz from the Linux admins saying that some of our processes are hogging the CPU :)

Additional info: when I run ps -ef, I see DSD.PHANTOM processes, or something related to RT_SC* (where * is any number), taking up 100% of the CPU. In "Cleanup Resources", the "Command last executed" looks like:


SH -c "RT_SC* ..."
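For anyone who wants to automate the manual ps check described above, here is a minimal sketch in Python. It assumes a Linux ps that accepts `-eo pid,pcpu,args`, and it assumes the stuck processes can be recognised by "phantom" or "RT_SC" appearing in the command line, as in the post. The function names and the 90% threshold are hypothetical, not part of DataStage.

```python
import subprocess


def find_busy_phantoms(ps_output, cpu_threshold=90.0):
    """Return (pid, cpu, command) tuples for suspect processes.

    Assumption: a suspect has "phantom" (any case) or "RT_SC" in its
    command line and a %CPU at or above cpu_threshold.
    """
    hits = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)          # pid, %cpu, rest of command
        if len(parts) < 3:
            continue
        pid, cpu, cmd = parts
        try:
            cpu_val = float(cpu)
        except ValueError:
            continue                         # not a data row
        suspect = "phantom" in cmd.lower() or "RT_SC" in cmd
        if suspect and cpu_val >= cpu_threshold:
            hits.append((int(pid), cpu_val, cmd))
    return hits


def snapshot():
    """Capture pid, %CPU and full command line for every process."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `find_busy_phantoms(snapshot())` from cron or a monitoring script would list the candidates. Note that ps reports %CPU averaged over the life of the process, so a job that has only just started spinning may not cross the threshold yet. The sketch only reports; it does not kill or clean up anything.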
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Is there any compilation process going on in parallel on your server, by any chance?
Does your Transformer do any complex operations or call a routine?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

Thanks kumar_s for your swift response.

As far as I know, there is no compilation process going on while our Master sequence is running. Some of the code that we have is complicated: some jobs have at least 30 stages, and the Transformers in those jobs have a bunch of stage variables and calculations. But mostly (not always), the jobs that hang are the ones that write to the database (Inserts or Updates). We do the inserts using bulk loading.

Each job runs an audit routine (server) that writes the job statistics to an XML file, which we later process and load into our Audit tables. Is this the info that you are looking for? If not, please let me know.