Sequence gone haywire

Posted: Wed Mar 19, 2008 3:23 am
by PhilHibbs
I have a Sequence that just runs a load of jobs one after another unconditionally. Last night it went wrong and tried to launch one of the jobs twice, and errored because "Job is not in the right state (compiled and not running)". Has anyone seen this happen before? One of my colleagues added a job to the sequence yesterday, but that's further down the chain and it didn't get that far.

Is it possible that running a job of the same name in another project might confuse the Job Schedule?

I had a similar problem a couple of years ago, it did something like this:
1. Tell the server to start running the job
2. Check that the job is running - no it isn't, ERROR!
3. Job starts running

Just a timing issue I think, but annoying.
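
The race in steps 1 to 3 above is typically avoided by polling for the state change with a timeout instead of checking once straight after the start request. A minimal sketch in Python (not DataStage code): `start_job` and `is_running` are hypothetical callables standing in for whatever the scheduler actually exposes, and the timeout values are arbitrary.

```python
import time


def start_and_wait(start_job, is_running, timeout=30.0, interval=0.5):
    """Start a job, then poll until it reports running or the timeout expires.

    start_job and is_running are hypothetical callables, not a DataStage API.
    Returns True if the job was seen running, False if the timeout expired.
    """
    start_job()
    deadline = time.monotonic() + timeout
    while True:
        if is_running():
            return True
        if time.monotonic() >= deadline:
            return False
        # Give the server time to move the job into the running state.
        time.sleep(interval)
```

A single immediate check corresponds to `timeout=0`, which is the behaviour that produced the false error described above.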

Re: Sequence gone haywire

Posted: Wed Mar 19, 2008 7:35 am
by chulett
PhilHibbs wrote:Is it possible that running a job of the same name in another project might confuse the Job Schedule?
Nope. Are you certain the Sequence itself tried to start the job twice? Any chance someone/thing else started it before your Sequence did?

ps. Don't recall ever seeing anything like this.

Posted: Wed Mar 19, 2008 7:42 am
by PhilHibbs
Screenshots of it working ok on Tuesday but going wrong last night:
http://www.hibbs.me.uk/images/sequence_ok.png
http://www.hibbs.me.uk/images/sequence_haywire.png

Posted: Wed Mar 19, 2008 7:52 am
by chulett
Odd... and I really got nothing, sorry. :(

Is this a repeatable behaviour with the modified version of the Sequence or was it a one time thing? I'll guess we may have to wait and see.

Posted: Wed Mar 19, 2008 8:12 am
by PhilHibbs
chulett wrote:Is this a repeatable behaviour with the modified version of the Sequence or was it a one time thing? I'll guess we may have to wait and see.
I'll let you know - we can only run this job after 6pm as it accesses a mainframe system with expensive daytime MIPS.

Posted: Wed Mar 19, 2008 9:41 am
by DSguru2B
Did you check whether that particular job ran, or look at its log for that day? It seems like the job just refused to obey its master.
Wait and see if the error is repeatable. Then maybe some ideas will pop up to debug it.
Also, make sure you're keeping the &PH& folder in check so that we can rule out any sort of timeouts.

Posted: Wed Mar 19, 2008 9:48 am
by kumar_s
What was the change exactly?
Was a job added at the tail of that Sequence, and were the Sequence and the job recompiled and imported into a new project?
Or was only the executable imported?
Was there no change to the job that the Sequence is complaining about?

Posted: Wed Mar 19, 2008 9:54 am
by PhilHibbs
What do you mean by "keep in check"? Delete old files? Check new files for errors?

Posted: Wed Mar 19, 2008 9:57 am
by DSguru2B
Yes. Delete old files. There is no debugging info in those files. They are just phantom files. Most folks have a script that runs daily and deletes all files 7 days old or older. If the &PH& file is left unchecked (that is, old files are never deleted), this affects the startup times of jobs.
Warning: Make sure you delete only old files; please don't accidentally delete any files for jobs that are currently running.
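
A daily cleanup along those lines could look roughly like the sketch below. This is an illustration in Python rather than anyone's actual script: the function name and the 7-day default are assumptions, and the directory to clean has to be passed in by the caller.

```python
import time
from pathlib import Path


def purge_old_files(directory, max_age_days=7, now=None):
    """Delete regular files in `directory` last modified over max_age_days ago.

    Returns the list of deleted paths. Only the modification time is checked,
    so recently written files (such as those of running jobs) are left alone.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    deleted = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

Pointing it at the project's &PH& directory from a scheduled task would give the daily purge described above. Running it first with a large `max_age_days` is a cautious way to check what it would remove.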

Posted: Wed Mar 19, 2008 9:58 am
by PhilHibbs
kumar_s wrote:What was the change exactly?
Was a job added at the tail of that Sequence, and were the Sequence and the job recompiled and imported into a new project?
Or was only the executable imported?
Was there no change to the job that the Sequence is complaining about?
The change was that another job was included in the sequence, about three or four steps further down the line than the point where it all went wrong. I don't think there was any change to the job that the Job Sequence tried to run twice. Even if there was, changing a job shouldn't cause the Sequence to run it twice simultaneously when it's only in the sequence once.

Posted: Wed Mar 19, 2008 10:02 am
by kumar_s
You can find the BASIC code in Job Control for that Job Sequence. Check whether that job is coded to be called twice. If so, check which stage is calling it again.

Posted: Wed Mar 19, 2008 10:09 am
by deanwalker
I don't think this ran 1 job twice, it ran 2 jobs at the same time:
MM_COLORS_LFA1_Tbl_EXT &
MM_COLORS_LXX_Extract

Does the 2nd job depend on data being landed by the first job? Take a look at the specific log for this job; you might get more of a clue.
And see if the sequence has been changed to not wait for the first job before starting the 2nd one.

Posted: Wed Mar 19, 2008 10:27 am
by PhilHibbs
deanwalker wrote:I don't think this ran 1 job twice, it ran 2 jobs at the same time
At 18:10:45 you can see "Waiting for job MM_COLORS_LXX_Extract to start", and then at 18:11:09 it tries to call DSRunJob(MM_COLORS_LXX_Extract ) and fails because it's already running.

None of the jobs depend on each other, they are all source system extracts into flat files. The only reason I sequence them is so that they don't all run at the same time and overload the source system.

Posted: Wed Mar 19, 2008 11:54 am
by deanwalker
In the working log, the 3rd job completes before the 4th is started. This does not happen in the broken log: the 3rd job is started, then the 4th job (...._LXX_Extract) is started immediately, which is why the 18:10:46 info msg says that it is waiting for 2 jobs.
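
The comparison of the two logs comes down to checking whether any job starts before the previous one has finished. A rough Python sketch of that check, assuming the start and finish times have already been pulled out of the log by hand; the `(job_name, start, end)` tuple layout is an assumption, not a DataStage log format.

```python
from datetime import datetime


def find_overlaps(runs):
    """Return pairs of job names whose run intervals overlap.

    runs is a list of (job_name, start, end) tuples with datetime values.
    A job that starts exactly when another ends does not count as overlapping.
    """
    ordered = sorted(runs, key=lambda r: r[1])  # sort by start time
    overlaps = []
    for i, (name_a, _start_a, end_a) in enumerate(ordered):
        for name_b, start_b, _end_b in ordered[i + 1:]:
            if start_b < end_a:
                overlaps.append((name_a, name_b))
            else:
                break  # every later run starts later still
    return overlaps
```

An empty result corresponds to the strictly sequential pattern in the working log; any pair returned marks two jobs that were running at the same time.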

Posted: Wed Mar 19, 2008 12:02 pm
by chulett
My assumption there was that a Sequencer set to 'All' was added to the sequence job; that's what makes it wait for multiple jobs at once. Nothing about doing that should make it go 'haywire'.