unable to run DS job
Hi,
I have an old testing project. I can create and compile test jobs against it, but I cannot run these jobs. Can anybody tell me what may cause this issue?
Thanks in advance
xli
More information please, Jonny. After you compile the jobs, does the status change to Compiled in Director? How are you trying to run the job: from Director, from dsjob, or by some other means? What symptom indicates that the job cannot run? Are any events recorded in the job log?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi, Ray
The job status changes to "Compiled" in Director after I compile it. I tried running it from both Designer and Director, and nothing happens. When I tried to run it using dsjob, it took a while and then issued the message below:
$ bin/dsjob -run Training SimpleSevTest
Error running job
Status code = -14 DSJE_TIMEOUT
Obviously, there is nothing in the job log.
Thanks
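For anyone scripting around dsjob, here is a minimal sketch of pulling the status code out of output like the one quoted above. It assumes the status line always looks like "Status code = -14 DSJE_TIMEOUT" as shown in this post; parse_dsjob_status is a hypothetical helper name, not part of DataStage.

```python
import re

# Matches the status line as it appears in the post above,
# e.g. "Status code = -14 DSJE_TIMEOUT". The symbolic name is optional.
STATUS_RE = re.compile(r"Status code = (-?\d+)(?:[ \t]+(\w+))?")


def parse_dsjob_status(output):
    """Return (code, name) from dsjob output, or None if no status line is found."""
    match = STATUS_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Sample text copied from the post; a real script would capture
# the command's output instead of using a literal string.
sample = "Error running job\nStatus code = -14 DSJE_TIMEOUT\n"
print(parse_dsjob_status(sample))
```

A wrapper script could then treat -14 separately from other failures, for example by checking the machine load before retrying.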
What else is the machine doing? This code is usually an indication that the machine is overloaded; either the number of processes exceeds the limit, or the total demand for resources hugely exceeds supply. Failure to start in a timely fashion can also be influenced by too small a value for the T30FILE setting, or by very many entries in that job's log and/or in the &PH& directory. You need to check all of these things. For example, use top or sar to monitor how busy the system is.
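One of the checks above, the number of entries in the &PH& directory, can be sketched as follows. This is only an illustration: count_entries is a hypothetical helper, the demo runs against a throwaway temporary directory, and you would pass the path of the &PH& directory inside your own project instead.

```python
import os
import tempfile


def count_entries(directory):
    """Count the entries directly inside a directory (no recursion)."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


# Demo on a temporary directory standing in for a project's &PH& directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(3):
        # Create three empty placeholder files.
        open(os.path.join(demo_dir, f"entry{i}"), "w").close()
    demo_count = count_entries(demo_dir)

print(demo_count)
```

What counts as "very many" depends on the system, so the sketch deliberately reports a number rather than applying a threshold.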
Keep in mind, too, that a parallel job will want to create many processes; one conductor process, one section leader process per processing node, and up to one process per stage in the job design. Have you tried running on a single-node configuration file, to reduce the startup time and the total number of processes?
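To put rough numbers on that, here is a small sketch of an upper bound on the process count. It assumes the "up to one process per stage" applies on each processing node; that per-node multiplication is my assumption, and the real count can be lower. estimate_max_processes is a hypothetical name.

```python
def estimate_max_processes(nodes, stages):
    """Rough upper bound on processes started by one parallel job.

    Assumption: up to one process per stage on every processing node.
    """
    if nodes < 1 or stages < 0:
        raise ValueError("nodes must be >= 1 and stages must be >= 0")
    conductor = 1             # one conductor process per job
    section_leaders = nodes   # one section leader per processing node
    players = nodes * stages  # assumed: one process per stage per node
    return conductor + section_leaders + players


# Compare a single-node configuration with a four-node one for a 5-stage job.
for nodes in (1, 4):
    print(nodes, estimate_max_processes(nodes, stages=5))
```

Under these assumptions a 5-stage job needs at most 7 processes on one node and 25 on four nodes, which is why a single-node configuration file is a useful test.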
Well, I don't think that the machine is overloaded, as I can still run the same job in another project residing on the same machine. To simplify the problem, I also created a very simple server job to process a few . The test result is the same.
I presume that there is something wrong with this project, but I am not able to figure out what happened.
Well, in that case, try restarting the server.
Then try executing the following command in the Administrator client:
Code:
COUNT DS_JOBOBJECTS
Make sure you get a valid count rather than an error.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
xli - there are not too many objects in your projects. The reason for doing a COUNT DS_JOBOBJECTS was to make DS traverse the whole set of file keys to make sure that each link was working, otherwise you would have gotten a fatal error message.
Can you try to create a very simple server job with no stages? In the Job Control section, put in the statement CALL DSLogInfo("Hello World",""). Compile it and try to run this job from either the command line or Director.
If you still get a timeout, then there is something very wrong in the project. The first stop is to use DS.TOOLS to reindex the repository files. If that fails, just out of curiosity, try to create a dummy routine in the Manager with just the one line above and see if you can compile and test it in the Manager.