Scheduled jobs intermittently don't start
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
I have noticed that occasionally I have a job sequencer that doesn't run when it's scheduled. It runs OK on the scheduled runs before and after the one it missed. If I look in the Job Log when this happens, I see the entry for DS.SCHED where the scheduler fired off, but no entries at all for the job running. If I look in Director, the last run date/time is the last time the job ran successfully. It appears to me that the job just isn't starting at all during this scheduled event. The last time this happened, at least two jobs were affected. Both were supposed to start at 5:00am, and both got the scheduler entry, but neither job actually ran.
Does anyone have any ideas why this might be happening? Or, any ideas on how I can determine why the jobs aren't starting? I did look in the cron log and can see that the process to run the job is called. I also looked in a UNIX system log (I can't remember now what it was called) and didn't see any problems during that time...
Thanks for your help,
Tony
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
Which creates the cron entry for you.
Tony and I have already 'talked' about this after he found my Oliver posting where we'd suffered from the same problem. In my case, it went as suddenly as it came and Ascential never could come up with an explanation as to why it was happening.
From what I remember, the symptoms were rather odd. The log would have only one new entry in it, the "Starting job xxxx" record. The odd thing was the Status view would still show the information from the previous run, as if it hadn't even tried to start.
Is that what you are seeing, Tony?
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
Craig,
Yes, this is exactly what I'm seeing.
Thanks, Ketfos, if it comes to that we will do that, but I would hate to have to go that route.
Sainath,
Yes, this is a multi-processor system. An HP UNIX system with 8 processors.
My problem, right now, is that I don't know what to look for to troubleshoot this issue any further. We did check the CRON logs and you can see where CRON launched the process that writes the Job Log entry, but the job itself never starts... We even looked at the /var/adm/syslog/syslog.log but there wasn't anything in there at all close to the time that this job was supposed to run.
Thanks everyone,
Tony
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
No. This is a job sequencer job that is scheduled to start at 5:00am. At 7:30am someone called to my attention that they hadn't received an email report from this job as they usually did. I checked, and what I saw in the Job Log for this Job Sequencer job was the DS.SCHED entry at 5:00am. There was literally nothing after that. I expected to see all the usual entries from the job after the DS.SCHED entry.
Let me know if you have further questions and I'll try my best to clarify my situation.
Thanks for your help,
Tony
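Since the missed run above was only noticed two and a half hours later, when a report email failed to arrive, a small check run shortly after the scheduled time could flag it sooner. The following is only a minimal sketch of the comparison itself. The function name `missed_run`, the ten-minute grace period, and the example dates are all hypothetical, and how you obtain the job's last start time (from Director or some command-line query) is left outside the sketch.

```python
from datetime import datetime, timedelta


def missed_run(scheduled, last_start, now, grace=timedelta(minutes=10)):
    """Return True if the job should have started by now but has not.

    scheduled  -- datetime the scheduler was supposed to start the job
    last_start -- datetime of the job's most recent start, or None
    now        -- datetime at which the check is being made
    grace      -- how long to wait after `scheduled` before judging
    """
    if now < scheduled + grace:
        return False  # too early to tell
    # A start time older than the schedule means this run never began.
    return last_start is None or last_start < scheduled


# Hypothetical example: job due at 5:00am, last start was the day before.
scheduled = datetime(2007, 3, 13, 5, 0)
yesterday = datetime(2007, 3, 12, 5, 0, 3)
print(missed_run(scheduled, yesterday, now=datetime(2007, 3, 13, 5, 15)))
```

Run from a separate scheduler entry a few minutes after each job is due, a check like this could send an alert instead of waiting for someone to miss a report.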
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
In version 4.1, the reason for this kind of thing was specific to multi-processor systems: when a job was given a process ID and the next job started immediately afterwards and mistakenly obtained the same PID, the engine got confused and took the completion of the first job to be the completion of the second.
This leads to the start symbol and no further processing.
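To make the PID explanation above concrete, here is a toy model of a tracker that identifies jobs by PID alone. It is only an illustration of the described confusion, not how the engine is actually implemented. The job names, the PID value, and the function `attribute_exits` are all hypothetical.

```python
def attribute_exits(events):
    """Attribute process exits to jobs using the PID as the only key.

    events is a list of ("start", job_name, pid) or ("exit", pid) tuples.
    Returns the job names that this naive tracker believes have finished.
    """
    running = {}
    finished = []
    for event in events:
        if event[0] == "start":
            _, job, pid = event
            # A second job with the same PID silently replaces the first.
            running[pid] = job
        else:
            _, pid = event
            job = running.pop(pid, None)
            if job is not None:
                finished.append(job)
    return finished


# JobA and JobB are both recorded under PID 4242; then JobA's process exits.
events = [("start", "JobA", 4242), ("start", "JobB", 4242), ("exit", 4242)]
print(attribute_exits(events))  # ['JobB']
```

The exit really belongs to JobA, but the tracker credits it to JobB, so JobB is treated as complete before it has done any work. Keying on something more specific than the PID, such as the PID plus a start timestamp, would avoid the collision in this toy model.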
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I've noticed (7.1 on AIX) that the job entries are to be found in atjobs rather than in cronjobs. But this really addresses the "can't schedule" problem rather than the "won't start" problem.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
Ray, the only time I've dealt with 'at' was when I scheduled a non-recurring job. The scheduler uses 'at' for non-recurring jobs, rather than 'cron'.
I did see the entry in the cron log where it started the "dsr_sched.sh" process for that job. I also saw the DS.SCHED entry in the Job Log for that job, but nothing else.
Tony
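For anyone wanting to confirm which recurring schedules cron actually holds, the sketch below filters crontab text (for example, the output of `crontab -l` for the user that owns the schedules) for entries that call `dsr_sched.sh`. It assumes the standard layout of five time fields followed by a command. The path and arguments in the sample are hypothetical, and non-recurring jobs scheduled through 'at' would not appear here.

```python
def scheduled_entries(crontab_text, marker="dsr_sched.sh"):
    """Return (minute, hour, command) for crontab lines whose command
    contains `marker`. Blank lines and comments are skipped."""
    entries = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Five time fields, then the rest of the line is the command.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        minute, hour, command = fields[0], fields[1], fields[5]
        if marker in command:
            entries.append((minute, hour, command))
    return entries


# Hypothetical crontab contents; the path and arguments are made up.
sample = """\
# hypothetical crontab contents
0 5 * * * /hypothetical/path/dsr_sched.sh arg1 arg2
30 2 * * 1 /usr/local/bin/backup.sh
"""
print(scheduled_entries(sample))
```

This only confirms that the schedule exists. As noted above, in this case cron did launch the script, so the failure lies somewhere after that point.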
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA