unable to run DS job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

xli
Charter Member
Posts: 74
Joined: Fri May 09, 2003 12:31 am

unable to run DS job

Post by xli »

Hi,

I have an old testing project. I can create and compile test jobs against it, but I cannot run these jobs. Can anybody tell me what may cause this issue?

Thanks in advance

xli
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

More information please. After you compile the jobs, does the status change to Compiled in Director? How are you trying to run the job - from Director, from dsjob, or by some other means? What symptom indicates that the job cannot run? Are any events recorded in the job log?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
xli
Charter Member
Posts: 74
Joined: Fri May 09, 2003 12:31 am

Post by xli »

Hi, Ray

The job status changes to "Compiled" in Director after I compile it. I tried running it in both Designer and Director; there is no reaction. When I tried to run it using dsjob, it took a while and then issued the message below:

$ bin/dsjob -run Training SimpleSevTest
Error running job

Status code = -14 DSJE_TIMEOUT

Obviously, there is nothing in the job log.
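For reference, the same thing can be confirmed from the command line; a minimal sketch, assuming dsjob is invoked from the engine directory as in the -run example above, with the same project and job names:

Code:

# Check the job's status and look for log entries from the command line
bin/dsjob -jobinfo Training SimpleSevTest    # reports the job's current status
bin/dsjob -logsum Training SimpleSevTest     # summarises log entries; expect nothing new if startup never happened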

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What else is the machine doing? This code is usually an indication that the machine is overloaded; either the number of processes exceeds the limit, or the total demand for resources hugely exceeds supply. Failure to start in a timely fashion can also be influenced by too small a value for the T30FILE setting, or by very many entries in that job's log and/or in the &PH& directory. You need to check all of these things. For example, use top or sar to monitor how busy the system is.
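A hedged sketch of how those checks might look on the engine host; the $DSHOME value and the project path are assumptions and will differ between installations:

Code:

# System load, using the standard UNIX tools mentioned above
sar -u 5 5          # CPU utilisation, five 5-second samples
top                 # interactive view of the busiest processes

# T30FILE (the dynamic hashed file limit) is a tunable in the engine's uvconfig file
grep T30FILE $DSHOME/uvconfig    # $DSHOME is an assumption, e.g. /opt/Ascential/DataStage/DSEngine

# Count the entries in the project's &PH& directory (the project path is an assumption)
ls /opt/Ascential/DataStage/Projects/Training/'&PH&' | wc -l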

Keep in mind, too, that a parallel job will want to create many processes; one conductor process, one section leader process per processing node, and up to one process per stage in the job design. Have you tried running on a single-node configuration file, to reduce the startup time and the total number of processes?
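For reference, a one-node configuration file is plain text pointed to by the APT_CONFIG_FILE environment variable; a minimal sketch, with the file location and fastname as assumptions:

Code:

# Create a minimal one-node parallel configuration file (location and hostname are assumptions)
cat > /tmp/one_node.apt <<'EOF'
{
  node "node1"
  {
    fastname "myserver"
    pools ""
    resource disk "/tmp" {pools ""}
    resource scratchdisk "/tmp" {pools ""}
  }
}
EOF

# Point the job at it before running, e.g. by exporting the variable in the shell
# or by setting APT_CONFIG_FILE as a job parameter.
export APT_CONFIG_FILE=/tmp/one_node.apt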
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
xli
Charter Member
Posts: 74
Joined: Fri May 09, 2003 12:31 am

Post by xli »

Hi, Ray

It doesn't work even for the simplest server job. However, the same server job runs with no problem in another project on the same server.

xli
daniel0623
Charter Member
Posts: 34
Joined: Tue May 31, 2005 8:17 pm
Location: ShangHai,China

Post by daniel0623 »

Hi,
I had the same issue once. Please export your job and import it into a new project, then run it in the new project. You should also restart the server before running. Good luck.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Did you check any of the suggestions I made? Even though it's a server job (you posted in the parallel job forum), these suggestions are still valid.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
xli
Charter Member
Posts: 74
Joined: Fri May 09, 2003 12:31 am

Post by xli »

Well, I don't think the machine is overloaded, as I can still run the same job in another project residing on the same machine. In particular, to simplify the problem, I created a very simple server job that processes just a few rows. The test result is the same.

I presume that there is something wrong with this project, but I am not able to figure out what happened.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Well, in that case, try restarting the server (a command-line sketch follows below).
Also try executing this command in the Administrator client:

Code:

COUNT DS_JOBOBJECTS 
Make sure you get a valid count number rather than an error.
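On the restart suggestion: for a UNIX engine this is typically done with the uv -admin commands; a hedged sketch, assuming $DSHOME is /opt/Ascential/DataStage/DSEngine and that no jobs or clients are connected:

Code:

# Stop and restart the DataStage engine (run as the administrative user, e.g. dsadm)
cd /opt/Ascential/DataStage/DSEngine     # assumed $DSHOME
. ./dsenv                                # source the engine environment
bin/uv -admin -stop                      # stop the engine
bin/uv -admin -start                     # start it again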
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
xli
Charter Member
Posts: 74
Joined: Fri May 09, 2003 12:31 am

Post by xli »

OK, I'll have to arrange to restart the server later.

I ran COUNT DS_JOBOBJECTS against this project:

20849 records counted.

It seems there are too many objects in this project.

xli
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

xli - there are not too many objects in your project. The reason for doing a COUNT DS_JOBOBJECTS was to make DataStage traverse the whole set of file keys and make sure that each link was working; otherwise you would have gotten a fatal error message.

Can you try creating a very simple server job with no stages, and in the Job Control section put in the statement CALL DSLogInfo("Hello World",""). Compile it and try to run this job from either the command line or Director.
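If the command-line route is taken, the run and the log check could look like this sketch; "Training" and "HelloWorld" are placeholder project and job names, and dsjob is assumed to be on the PATH:

Code:

# Run the hello-world job, wait for it to finish, then look for the DSLogInfo message
dsjob -run -wait Training HelloWorld
dsjob -logsum Training HelloWorld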

If you still get a timeout, then there is something very wrong in the project. The first stop is to use DS.TOOLS to reindex the repository files. If that fails, just out of curiosity, try to create a dummy routine in the Manager with just one line as above and see if you can compile and test it in the Manager.
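A hedged sketch of one way to reach DS.TOOLS on the engine host; the paths are assumptions, and the utility itself is menu-driven once started:

Code:

# From the engine directory, start the engine shell and launch the tools menu
cd /opt/Ascential/DataStage/DSEngine     # assumed $DSHOME
. ./dsenv
bin/dssh                                 # the lines below are typed at the dssh prompt
LOGTO Training                           # attach to the problem project's account
DS.TOOLS                                 # menu-driven; choose the option that rebuilds repository indices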