Exception Handling Best Practices?

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Having fun with Sequence jobs, part 12. DS 6.0.1r5 on HP-UX, btw.

I have small sequences that use Triggers off of each Job stage to do Email Notification of failures, typically using the 'Otherwise' condition, and they seem to work fine. Using that same methodology for large Sequence jobs seems just... unwieldy, so I was wondering if there was a better way.

I added an Exception Activity to a large sequence and had it trigger the Email Notification. From reading the generated code, it would send the email if *anything* went wrong during the run of the sequence. It sends an email on the initial problem and then only logs all future problems, which is ok I guess. The problem I'm having is, even though the emails from the Job Triggers were able to include job status in the email, this email coming from the exception handler *does not* include any status information, even though that option is checked in the Email Notification stage.

So, a couple of questions for the group. Is this a bug? Meaning, shouldn't relevant job status information be included in the notification coming from the exception handler? Anyone know if there are any 'macros' or functions that can be leveraged within a generic email message to include anything like Project Name, Job name, etc - anything that could make the email messages more helpful?

On a related note, what kinds of things are people doing for notification? Are you hooking in lots of individual, specific emails or trying to do something more generic? Are you writing your own routines to supplement your jobs' ability to cry for help?

Thanks for any insight,

-craig
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Craig

I guess you are the only one with this problem. Sorry.

Kim.

Kim Duke
DsWebMon - Monitor DataStage over the web
www.Duke-Consulting.com
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Kim,

I suspect that Craig's just a bit "further out in front" than I am and I haven't run into this yet. [:)]

Good Luck, Craig. Sorry I can't help you on this.

Tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Bummer! [}:)]

Isn't anybody else doing large Sequence jobs that page/email out if there are any problems? Sheesh...
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Craig

Most of my friends haven't had the need. Their jobs never fail. What is your problem? Lol.

Kim.

Kim Duke
DsWebMon - Monitor DataStage over the web
www.Duke-Consulting.com
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I used the same technique as you: a trigger leading off each job stage on the 'Otherwise' condition that runs a routine. The routine is passed the failed job name as an argument; it then retrieves the last error message from the log for that job and sends it in the email or SMS.

I also put in an Exception Handler stage that picks up unhandled errors such as a job not being compiled or a missing routine. For these exceptions the notification carries the error description, and DSJobName should supply the name of the sequence job that raised the error.

I'd expect exception errors to be rare in a production environment. The email and SMS messages are sent from a Unix script, and the target support email addresses are in a settings file.
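
A routine along those lines could look something like the following. This is a minimal sketch in DS Basic, not Vincent's actual code. It assumes a server routine with a single argument called FailedJobName (a made-up name), and it assumes the sequence has already detached from the failed job by the time the trigger fires. The routine name is also made up. Sending the mail or SMS is left to whatever script you already have.

```basic
* Sketch only: body of a server routine whose single argument is
* FailedJobName (hypothetical). Returns the message text in Ans.
$INCLUDE DSINCLUDE JOBCONTROL.H

      RoutineName = "GetLastError"   ;* hypothetical routine name
      Ans = ''

      hJob = DSAttachJob(FailedJobName, DSJ.ERRNONE)
      If Not(hJob) Then
         Call DSLogWarn("Cannot attach to " : FailedJobName, RoutineName)
      End Else
         * Newest fatal entry; fall back to the newest warning.
         * Assumes a negative return means no such entry was found.
         EventId = DSGetNewestLogId(hJob, DSJ.LOGFATAL)
         If EventId < 0 Then EventId = DSGetNewestLogId(hJob, DSJ.LOGWARNING)
         If EventId >= 0 Then
            * Entry has the form TimeStamp\User\EventType\Message
            Entry = DSGetLogEntry(hJob, EventId)
            Ans = Field(Entry, "\", 4, 999)
         End
         ErrCode = DSDetachJob(hJob)
      End
```

The returned text can then be handed to the notification step. An empty result means no fatal or warning entry could be read.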

Vincent McBurney
Data Integration Services
www.intramatix.com
stan_taylor
Charter Member
Posts: 14
Joined: Tue Mar 04, 2003 3:27 pm

Post by stan_taylor »

Also depends on what you call (or want to call) an Exception . . .

If your definition is the same as the one used to trigger a DataStage Exception stage, then great. By my definition, though, I wouldn't want to continue a sequence if I ran into something that would trigger the Exception stage, so I never bothered trying to find out the status of anything that followed.

On the other hand, there are a number of non-fatal and unsuccessful events that I *do* want to know about, with various actions based on the nature of each. For a situation similar to yours where the sequence can continue, I link all the stages in the sequence to a Sequencer stage. Set the links with the appropriate conditions for each activity, set the Mode on the Sequencer stage to Any, and use that to trigger a Notification stage. If you have the 'include job status in email' box checked, the Notification will include a summary for *all* the JobActivities in your sequence.

One byproduct of linking every activity to the Sequencer stage is that large sequences will look messy. I consider this a good thing, because it forces the sequences to be smaller. Having smaller sequences, though, means stringing *those* sequences together with yet more sequences, creating a hierarchy of sequences where the leaf nodes are the actual jobs, routines, etc. and the sequences represent logical groupings of job functions. I have tried to keep these organized in such a way as to be independently testable, so that the functionality of any sequence at any level of the hierarchy can also correspond to a certain set of tests (e.g. unit, string, integration, system) - YMMV. I only bring this up because, in handling exceptions, it becomes necessary to propagate exception conditions up the control hierarchy from the different levels. I have created some routines which basically send the appropriate condition to the log, which then propagates up to the controlling job. This allows you to make the decisions at any point in the tree as to how to proceed based on events further down the tree.

Not sure if all that would be considered 'best practices' in DataStage, but at least now you know you aren't the only one thinking about this stuff.

Stan
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Stan

I agree that sequences should be small and run other sequences. I think the issue is that every job needs two links, one of them for failures, so every job ends up pointing to the failure path that emails someone. This makes for an ugly sequence.

Next is restartability. Let's say we have a routine or even a TCL process which starts all our jobs. This routine always knows how to abort the sequence after it emails whoever when a job fails. It also knows how to update a hash file with results like row counts and job statuses. These hash files could be used to restart a sequence at the point where it failed. Now you have no failure path in your sequences. It looks clean. It can be restarted at the point of failure. Very powerful. I will build it for a price.
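
To make the idea concrete, here is a rough sketch of such a runner in DS Basic. It is not Kim's implementation, only an illustration under stated assumptions: a server routine with one argument, JobName; a hashed file called RunStatus (a made-up name) that already exists in the project and is keyed by job name; and field 1 of each record holding the last job status. Emailing, row counts, and resetting an aborted job before rerunning it are all left out.

```basic
* Sketch only: body of a server routine with one argument, JobName.
* RunStatus is a hypothetical hashed file keyed by job name.
$INCLUDE DSINCLUDE JOBCONTROL.H

      RoutineName = "RunRestartable"   ;* hypothetical routine name
      Ans = 0

      Open "RunStatus" To StatusFile Else
         Call DSLogFatal("Cannot open hashed file RunStatus", RoutineName)
      End

      * Skip the job if an earlier run already recorded success
      Read Rec From StatusFile, JobName Else Rec = ''
      If Rec<1> = DSJS.RUNOK Or Rec<1> = DSJS.RUNWARN Then
         Call DSLogInfo(JobName : " already finished, skipping", RoutineName)
      End Else
         hJob = DSAttachJob(JobName, DSJ.ERRNONE)
         If Not(hJob) Then
            Call DSLogFatal("Cannot attach to " : JobName, RoutineName)
         End
         ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
         ErrCode = DSWaitForJob(hJob)
         Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
         ErrCode = DSDetachJob(hJob)

         * Record the outcome so a rerun knows where to resume
         Rec<1> = Status
         Write Rec On StatusFile, JobName

         If Status <> DSJS.RUNOK And Status <> DSJS.RUNWARN Then
            * Aborts the calling sequence; an email step could go here
            Call DSLogFatal(JobName : " failed, status " : Status, RoutineName)
         End
      End
```

With something like this, each Routine Activity needs only its success link. The status file has to be cleared at the start of a fresh (non-restart) run, otherwise every job will be skipped.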

Kim.

Kim Duke
DsWebMon - Monitor DataStage over the web
www.Duke-Consulting.com
stan_taylor
Charter Member
Posts: 14
Joined: Tue Mar 04, 2003 3:27 pm

Post by stan_taylor »

Kim,

- or you could just send me Ken's code and bill me for it. Except we already have it (couldn't resist [:)]).

Yes, I am aware of this - the code is great, very robust and very well-tested. In general, though (and not being a long-time DataStage developer), I would prefer to move more of the functionality out of the code and into the pictures. Sure, it might look messier, but to me it gives a more accurate and complete view of what is really going on. This is particularly true for exceptional conditions, which is what this thread is about.

That said, I don't see any functionality (e.g. recoverability) that using sequences precludes from being implemented. We can discuss how that could be handled in a different thread.

Stan
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Everyone -

Thanks for your replies, they've been very insightful & helpful.

Is there any chance any of you could share your methodology for propagating errors back up a chain of sequences? I'm facing the same issue some of you apparently have solved - series of sequences / sub-sequences where jobs fail, email out and the current sequence stops, but subsequent sequences go merrily on their way because the Sequence itself 'Finished Ok'. [}:)] Appreciate any help you could send my way. Thanks!

-craig

ps. Thanks for the tip on a single 'Any' Sequencer for email, Stan, works like a charm! Don't know why that never occurred to me... [sigh]
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Craig

After the email, you need to run a routine which forces the sequence to fail. The parent sequence will then know that the sequence has failed. The routine just needs the sequence name. Here is the code.

* DSLogFatal(Message, CallingProgName) logs a fatal error and aborts the calling job
Call DSLogFatal("Job Failed", JobName)
Ans = ''

Kim.

Kim Duke
DsWebMon - Monitor DataStage over the web
www.Duke-Consulting.com
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Duh - so simple. Thanks! If I had a spare brain cell left I probably would have figured that out, but I appreciate the spoon feeding right now. [xx(]

-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The parent job in the sequence may or may not get to hear about it, depending on the second argument value to DSAttachJob() that is generated. Check out the on-line help for DSAttachJob to see what I mean: what the difference is between DSJ.ERRFATAL, DSJ.ERRWARN and DSJ.ERRNONE.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Hmmm... know all about the attachment options for error handling. What I don't know is what code the Sequence jobs generate and if there is any way to control that aspect of them. Have to check it out...

-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The default for a job sequence is DSAttachJob(jobname,DSJ.ERRNONE).
This prevents propagation of exit status to parent, asserting that the parent's job control code will handle it.
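
If the parent's job control code is expected to handle it, one way is to read the child's status explicitly once it finishes. The following is a minimal sketch in DS Basic, not the code a Sequence generates. It assumes a server routine with one argument, ChildName (a made-up name), and a made-up routine name.

```basic
* Sketch only: body of a server routine with one argument, ChildName
* (hypothetical), called from the parent's job control.
$INCLUDE DSINCLUDE JOBCONTROL.H

      RoutineName = "RunAndCheck"   ;* hypothetical routine name
      Ans = 0

      * DSJ.ERRNONE: nothing is raised for us, so check every step
      hChild = DSAttachJob(ChildName, DSJ.ERRNONE)
      If Not(hChild) Then
         Call DSLogFatal("Cannot attach to " : ChildName, RoutineName)
      End

      ErrCode = DSRunJob(hChild, DSJ.RUNNORMAL)
      If ErrCode <> DSJE.NOERROR Then
         Call DSLogFatal("Cannot start " : ChildName, RoutineName)
      End
      ErrCode = DSWaitForJob(hChild)
      Status = DSGetJobInfo(hChild, DSJ.JOBSTATUS)
      ErrCode = DSDetachJob(hChild)

      If Status = DSJS.RUNOK Or Status = DSJS.RUNWARN Then
         Ans = Status
      End Else
         * Abort the parent so the failure is visible further up
         Call DSLogFatal(ChildName : " ended with status " : Status, RoutineName)
      End
```

Whether a finish with warnings should count as success is a design choice. Drop the DSJS.RUNWARN test if warnings should stop the chain too.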