PX Jobs reports failure (code 255)

Posted: Fri Feb 11, 2005 7:18 am
by NewPXUser
We are using PX DS v7. Recently, after bouncing DS and the Unix server, we have started receiving the following error message in all PX jobs.

It says
Starting job jobname'...
Environment variable settings
Parallel job initiated
Contents of phantom output file nnnn,nnnn
Parallel job reports failure (code 255)
Job jobname aborted

Can anybody let me know whether they have seen this before, what it means and how to rectify it?

Some Server jobs are also affected, but I am not sure whether the two problems are related.

Any feedback is appreciated.

Posted: Fri Feb 11, 2005 7:46 am
by ds_is_fun
Check all the default configuration settings against the installation manual.
Has your default.apt configuration file been configured?

Posted: Fri Feb 11, 2005 8:12 am
by NewPXUser
As mentioned, the jobs were running fine before and all configurations were set up as required.

Can you please be specific about what to look at and what to verify, so that it is easier to trace the cause?

Posted: Fri Feb 11, 2005 8:41 am
by ArndW
Hello NewPX,

As ds_is_fun asked, you need to know whether you are using the default configuration file. You can see this in the Director log entry for a job run. Does your job process any rows at all, or does it abort before processing (again, you can see this in the Director)? Can you compile jobs, or does DS have problems with that? If you create a trivial new DS job (read from a sequential file into a Peek stage, for instance), does that compile, and does it generate the same error at runtime?

Posted: Fri Feb 11, 2005 8:53 am
by NewPXUser
It uses a config file designed for 4 processors. The same file was in use when the jobs were running OK.

The parallel jobs compile OK. All parallel jobs, however small, including a 1 - 1 mapping, fail with this error.

Can somebody tell me what this error implies? Also, is there a list of error codes that we can obtain to check against?

Posted: Fri Feb 11, 2005 9:12 am
by ArndW
The 255 might not help in this case; it looks like the high value of an 8-bit number, which is why the questions about other ways of narrowing down the cause keep coming. Have you looked at the detail records in the Director log to see whether there is more text in there somewhere? You might also go to the project directory, into the &PH& subdirectory, and look at the files from the last run. There will be a lot of files in there, so sort the list by date. They may contain some additional information.
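
To make the &PH& suggestion concrete, here is a minimal Python sketch that lists the most recently modified files in a directory and prints the tail of each. The project path is a placeholder and the helper name newest_files is made up for illustration; nothing here is a DataStage API.

```python
from pathlib import Path


def newest_files(directory, limit=5):
    """Return up to `limit` regular files in `directory`, newest first."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    # Placeholder path: substitute your own project directory.
    ph_dir = Path("/path/to/your/project") / "&PH&"
    if ph_dir.is_dir():
        for path in newest_files(ph_dir):
            print("====", path.name)
            # Show only the tail; phantom output files can be long.
            print(path.read_text(errors="replace")[-2000:])
    else:
        print(f"{ph_dir} not found; adjust the path")
```

Running it right after a failed job should put the phantom output for that run at the top of the list.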

Posted: Fri Feb 11, 2005 12:13 pm
by ds_is_fun
Arnd,
Do you think it could be a compiler related problem?
If all the configs are fine and the user ran it successfully earlier, then I guess it needs some kind of patch from Ascential.

Posted: Fri Feb 11, 2005 5:24 pm
by ray.wurlod
Given that Ascential do not supply the compiler, they probably won't be able to help if it's a compiler problem. :cry:
But they may know whether you need a patch from the compiler vendor. Have you sought support?

Posted: Mon Feb 14, 2005 5:26 pm
by T42
Whenever you bounce a server, you replace whatever carefully crafted configuration you may have with the newly installed configurations.

Ask yourself this question:

What has changed on the server?

If you are unable to answer that question, well... I think you will need to ensure that the question WILL be answered next time.

In the meantime...

When you say '1:1' mapping, does that mean that you are doing a simple sequential file -> sequential file job? Or is there a transformer in it? This is an important distinction that we need to make here -- as transformers (and other C-code based stuff such as BuildOps) require the use of an external compiler, and other stages depend on an internal compiler.

If there's a transformer, there's a problem with the path to the compiler.

If there's no transformer, there's a problem with the path to the PXEngine directory.
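
A rough way to check both paths after a bounce is to inspect the environment the jobs actually run with. The Python sketch below assumes the usual PX variable names APT_ORCHHOME (the PXEngine directory), APT_CONFIG_FILE and APT_COMPILER; those names are an assumption here, so confirm them against the "Environment variable settings" entry in your own job log. The function name check_px_environment is made up for illustration.

```python
import os
import shutil


def check_px_environment(env=None):
    """Return a list of problems found; an empty list means none were found."""
    env = os.environ if env is None else env
    problems = []

    # Assumed variable name for the PXEngine directory.
    orch_home = env.get("APT_ORCHHOME")
    if not orch_home:
        problems.append("APT_ORCHHOME is not set")
    elif not os.path.isdir(orch_home):
        problems.append(f"APT_ORCHHOME is not a directory: {orch_home}")

    # Assumed variable name for the parallel configuration file.
    config_file = env.get("APT_CONFIG_FILE")
    if not config_file:
        problems.append("APT_CONFIG_FILE is not set")
    elif not os.path.isfile(config_file):
        problems.append(f"APT_CONFIG_FILE is not a file: {config_file}")

    # Assumed variable name for the external compiler used by transformers.
    compiler = env.get("APT_COMPILER")
    if not compiler:
        problems.append("APT_COMPILER is not set")
    elif shutil.which(compiler, path=env.get("PATH", "")) is None:
        problems.append(f"compiler not found or not executable: {compiler}")

    return problems


if __name__ == "__main__":
    for problem in check_px_environment():
        print(problem)
```

Run it as the same Unix user, and from the same environment, that the DataStage server starts with; a login shell can have a different PATH from the one the engine inherited at startup.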

Posted: Mon Jun 27, 2005 9:40 am
by estevesm
I also get the same error code. However, if I manually reset the failed job using Director, it runs fine after that.
So I believe it gives this error code if you try to run a failed job, even if you specify the "reset and run" option in dsjob...
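
If the explicit reset is what makes the difference, one workaround is to issue the reset and the run as two separate dsjob calls. The Python sketch below only illustrates that idea: it assumes dsjob is on the PATH and that -run, -mode RESET, -mode NORMAL, -wait and -jobstatus behave in your release as described in the dsjob documentation. The function names are made up, and the exit codes are returned without interpretation.

```python
import subprocess


def reset_then_run_commands(project, job, dsjob="dsjob"):
    """Build the two dsjob command lines: a reset, then a normal run."""
    reset = [dsjob, "-run", "-mode", "RESET", "-wait", project, job]
    run = [dsjob, "-run", "-mode", "NORMAL", "-jobstatus", project, job]
    return [reset, run]


def reset_then_run(project, job, dsjob="dsjob"):
    """Execute both commands in order and return their exit codes.

    The codes are not interpreted here; check the dsjob documentation
    for what each value means in your release.
    """
    codes = []
    for command in reset_then_run_commands(project, job, dsjob):
        codes.append(subprocess.run(command).returncode)
    return codes
```

For example, reset_then_run("myproject", "myjob") would reset the job and then start it, where both names are placeholders for your own project and job.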