Datastage Jobs Not Running

paranoid
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Datastage Jobs Not Running

Post by paranoid »

Hi,

For the last couple of days we have been facing a weird situation that we have never experienced before. Our DS jobs are scheduled with cron, and Unix scripts invoke the jobs at the scheduled time.

The Unix script gets through its initial steps (checking the run file, etc.) but then fails at the point of invoking the DS jobs, for no apparent reason.

After some preliminary analysis, we cleared the jobs' logs and ran the scripts again. After that, the scripts ran successfully.

But again today, even after clearing all the old logs of the jobs, the jobs failed for the same reason and we could finish them after re-running those scripts manually.

Do we need to restart the server after clearing the log files?
Or is there any other reason behind these script failures? :(

Please advise.

Thanks in advance.

Sue
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: Datastage Jobs Not Running

Post by ray.wurlod »

paranoid wrote:... the jobs failed for the same reason and we could finish them after re-running those scripts manually....
What was the reason?
Are ANY events logged either in DataStage logs or in cron (or script) logs?
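One way to make sure a cron-launched script leaves a trace is to wrap the call so that its output and exit status land in a log file. The sketch below is only an illustration: `run_logged`, the log path, and the script path are hypothetical names, not taken from the original scripts.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command and append its output and exit
# status to a log file, so a failure under cron is recorded somewhere.
# Usage: run_logged LOG_FILE command [args...]
run_logged() {
    log_file=$1
    shift
    printf '%s START %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >>"$log_file"
    if "$@" >>"$log_file" 2>&1; then
        rc=0
    else
        rc=$?
    fi
    printf '%s END rc=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$rc" >>"$log_file"
    return "$rc"
}

# Example call (both paths are placeholders):
# run_logged /var/tmp/ds_nightly.log /path/to/run_ds_jobs.sh
```

With a wrapper like this, the START and END lines show exactly when the script ran and what it returned, even when nothing reaches the DataStage log.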
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
paranoid
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Re: Datastage Jobs Not Running

Post by paranoid »

ray.wurlod wrote:
paranoid wrote:... the jobs failed for the same reason and we could finish them after re-running those scripts manually....
What was the reason?
Are ANY events logged either in DataStage logs or in cron (or script) logs?
Ray,

The DS job logs say "Resetting the job". Any ideas?
Thanks for your swift response.

Thanks

Sue
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That suggests that you have set "reset if required, then run" in the job activities in your sequence. Does the log later report "job has been reset"?
paranoid
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

Yes Ray, the last message was "Job has been reset".

Thanks
Sue
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Normally, any failure in the execution will be highlighted with a red mark in the log. Do you see any?

Resetting is one of the features of a job sequence that ensures previously failed jobs re-run correctly, so those messages should not matter.

Post any warnings or errors present in the log, especially those logged after your scripts start.

Also, what were the previous errors that led you to clear the logs before rerunning the jobs successfully?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

While normally no, you should not have to restart anything after doing something as simple as clearing a job's log, it sounds like that might be prudent in your case given what's going on.
-craig

"You can never have too many knives" -- Logan Nine Fingers
paranoid
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

Hi,

Today another 30 jobs failed, and when I checked the DS server CPU, it was 0 percent idle.
Is this the reason why these jobs are failing?

Thanks
Sue
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

0% idle can be a cause for jobs to fail due to timeouts or overloaded resources.
But all of this is pure guesswork until you have an error message of some type.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

The CPU works in time-sharing mode, so a 100% loaded CPU may not be the reason. All it tells you is that CPU availability may be the bottleneck for your performance.

Check whether your /tmp is getting full.

Also post the error message(s) in your log. Without them, your guess is as good as ours.
paranoid
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

I checked /tmp and it is only 3 percent full, so I guess that is not the issue.
We are not getting any error status codes in the DataStage logs; the jobs just get reset by themselves. What could be the reason?

When we re-run the scripts manually after 4 AM EST, they run fine. The failures all happen between 2 and 4 AM EST.

The weird thing is that the idle time is still 0 after 4 AM EST, yet the jobs that failed between 2 and 4 AM run fine when we run them manually.

:shock:

Any advice?

Thanks

Sue
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Check with your Unix admin team whether they monitor disk space and directory structures. If they do, ask them for any alerts raised during the problem hours.
paranoid
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

Sai,

I emailed my UNIX team about this and am eagerly awaiting their input. I will post updates here.

Thanks everyone for looking into this issue.

DSxchange rocks :)

Sue
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

paranoid wrote:I checked /tmp and it is only 3 percent full, so I guess that is not the issue.
/tmp is 3% at what time? Did you check it at the peak of DataStage processing? Please rule out the possibility of /tmp filling up by checking it at its high-water mark.
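A rough way to catch the high-water mark is to sample the filesystem from cron during the failure window and keep the timestamps. This is a minimal sketch that assumes a POSIX `df -P` and a mount point without spaces. `sample_usage`, the script name, and the log path are made-up names for illustration.

```shell
#!/bin/sh
# Print one timestamped line with the used percentage of a filesystem.
# With POSIX `df -P`, the second output line describes the filesystem and
# its second-to-last field is the capacity (for example "3%").
sample_usage() {
    dir=${1:-/tmp}
    pct=$(df -P "$dir" | awk 'NR == 2 { print $(NF - 1) }')
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$dir" "$pct"
}

# Example crontab entry (assumed paths), sampling every minute from 02:00 to 03:59:
# * 2-3 * * * /path/to/sample_tmp.sh >> /var/tmp/tmp_usage.log 2>&1
sample_usage "${1:-/tmp}"
```

Comparing the resulting log against the times the jobs failed shows whether /tmp was anywhere near full during the 2 to 4 AM window.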
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
paranoid
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

Hi,

The jobs failed today as well, and I finally found the error code when running them manually on the server: "Status code = -14 DSJE_TIMEOUT".

Searching this forum for that error message, I found that it occurs when the server is overloaded.

Any resolution for this?

Thanks
Sue
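
If the timeout really is load-related, one possible stopgap (not a fix for the overload itself) is to have the calling script retry after a pause when it sees that message. The sketch below is only an illustration: `retry_on_timeout` is a hypothetical helper, the match on the text "DSJE_TIMEOUT" is an assumption based on the message quoted above, and the command being retried is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command while its output mentions DSJE_TIMEOUT.
# Usage: retry_on_timeout MAX_TRIES DELAY_SECONDS command [args...]
retry_on_timeout() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while :; do
        if out=$("$@" 2>&1); then
            rc=0
        else
            rc=$?
        fi
        printf '%s\n' "$out"
        case $out in
            *DSJE_TIMEOUT*) ;;      # timeout reported: consider a retry
            *) return "$rc" ;;      # anything else: pass the status through
        esac
        if [ "$try" -ge "$max_tries" ]; then
            return 1                # still timing out after the last try
        fi
        try=$((try + 1))
        sleep "$delay"
    done
}

# Example call (script path is a placeholder): up to 3 tries, 5 minutes apart
# retry_on_timeout 3 300 /path/to/run_one_ds_job.sh
```

Staggering the cron start times so fewer jobs launch together between 2 and 4 AM is the other obvious thing to try, since retries alone add more work to an already busy server.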