I am working on a process to add to all of our production jobs that call DataStage from UNIX scripts via the dsjob command. Within our architecture we have created a project for each subject area (order/invoice, material management, finance, ...). We then use categories to group all the designs that will be called from one UNIX script; a category may contain a single design, or many designs plus a controlling design with jcl. We would like to automate the reset process and have come up with the following jcl code. It will be imported into every project and called by a generic UNIX wrapper script. Please feel free to comment (flame suit is on); nothing will be taken personally.
**********************************************************************************
* This job will be called to reset any jobs within the same category as the
* passed jobname. This is a multi-instance job and must be called with an
* invocation id appended to the dsjob command.
*
* NOTE: 1. field number 3 is the job category field of the DS_JOBS hashfile
* 2. ERRNONE and Dummy are used to prevent failure of this script since it
* will be run before every dsjob call to execute a DataStage job.
**********************************************************************************
* Function definition to allow path to DataStage Routine UtilityHashLookup
DefFun UtilityHashLookup(A1,A2,A3) Calling "DSX.UTILITYHASHLOOKUP"
* Call routine to find the passed jobs category within this project
ThisJobsCategory = UtilityHashLookup("DS_JOBS", JOBNAME, 3)
* Establish a list of all jobs within this project
JobList = DSGetProjectInfo(DSJ.JOBLIST)
* Get the number of jobs to drive the following for loop
ListCount = Dcount(JobList, ",")
* Begin for loop to find and, if necessary, reset all jobs within this category
For ListNumber = 1 to ListCount Step 1
* Parse JobList to find each jobname
ParsedJob = Field(JobList, ",", ListNumber)
* Find parsed job category
ParsedCategory = UtilityHashLookup("DS_JOBS", ParsedJob, 3)
* Logic to determine if this parsed job is within the same category as the passed job
If ParsedCategory = ThisJobsCategory Then
* Try to attach to the parsed job. DSJ.ERRNONE used to avoid failure with non-compiled jobs
AttachedJob = DSAttachJob(ParsedJob, DSJ.ERRNONE)
* Logic to determine if attach was made to parsed job
If (AttachedJob) Then
* Capture status of last execution
LastRunStatus = DSGetJobInfo(AttachedJob, DSJ.JOBSTATUS)
* Logic to determine if parsed job should be reset
If LastRunStatus = DSJS.RUNFAILED Or LastRunStatus = DSJS.RUNWARN Or LastRunStatus = DSJS.CRASHED Or LastRunStatus = DSJS.STOPPED Then
* Reset the parsed job and wait for it to finish
Call DSLogInfo(ParsedJob:LastRunStatus, "Resetting job and last run status")
Dummy = DSRunJob(ParsedJob, DSJ.RUNRESET)
Dummy = DSWaitForJob(ParsedJob)
End
* Release job handle
Dummy = DSDetachJob(AttachedJob)
End
End
Next ListNumber
DONE:
Looks OK, except that there's no need to reset for DSJS.RUNWARN (finished with warnings). Jobs that exit with DSJS.VALFAILED, however, do need to be reset.
You don't check to determine that the reset was OK (status should be DSJS.RESET after DSWaitForJob).
:D :D :D Kudos for good documentation! Everyone else take note! :D :D :D
And, if you ever get DS390, using the term JCL for DataStage BASIC code could be misleading. Prefer "Job Control" routine or "before/after subroutine".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks for the info, Ray. I will make the change from RUNWARN to VALFAILED.
I had intentionally left out the status check of the reset. I want this job to never fail, since it will be run in every production job. The purpose of this is to allow easier after-hours support of production jobs when users may not have access to Director or the UNIX command line. I am predicting that this job will perform a reset maybe one time in a thousand; in all other cases it will do absolutely nothing. In the case where it does nothing, I don't want any chance of an error causing a production problem.
Thanks for the recognition on the documentation; it never hurts to put it in, and it can be a big time saver when you write something that may only be modified every couple of years.
Sorry about the terminology. The whole jcl thing always confuses me when reading the DS documentation (mainframe programmer in a past life).
Your code will never perform a reset. I make this assertion based on the axiom that if you've written a handler for something, it will never happen.
Seriously, though, it's a useful utility to have. There are some other things that could be done at the same time, such as cleaning out old files from the &PH& directory, purging logs of old entries (assuming that you don't have auto-purge configured), and cleaning up any temporary files that you may have created in staging areas and the like which are no longer required. You might contemplate adding some or all of these to your routine.
For example, to remove old entries from &PH& you run find (to specify the age) combined with rm.
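A minimal sketch of that cleanup, assuming anything in &PH& older than seven days is safe to delete. The project path is a stand-in for your real project directory; for demonstration the script builds a sandbox with one fresh and one backdated file:

```shell
#!/bin/sh
# Sketch: purge &PH& entries older than 7 days.
# PROJECT_DIR is a stand-in; point it at your real project directory.
PROJECT_DIR=$(mktemp -d)
mkdir "$PROJECT_DIR/&PH&"                    # directory is literally named &PH&
touch "$PROJECT_DIR/&PH&/DSD.RUN_fresh"
touch -t 200001010000 "$PROJECT_DIR/&PH&/DSD.RUN_stale"   # backdate to look old

# Quote the path: & would otherwise be taken as the shell background operator.
# -mtime +7 selects files last modified more than 7 days ago.
find "$PROJECT_DIR/&PH&" -type f -mtime +7 -exec rm -f {} \;

ls "$PROJECT_DIR/&PH&"    # only DSD.RUN_fresh remains
```

On a real project, drop the sandbox lines and run the find from cron or the wrapper itself.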
You were right, Ray! That code wouldn't reset anything. There was an error in the DSRunJob and DSWaitForJob calls: I used the variable ParsedJob rather than the correct AttachedJob. Only took a few minutes of head scratching to find that one.
In Director this returns within one second for a no-reset case, and 6 to 10 seconds for a reset. Do you have any suggestions for improved performance, or is this as good as it gets?
I still have to write the shell script wrapper. Hopefully the dsjob interface will not add much time to the process; if this gets over 2-3 seconds it won't be worth the effort.
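A first cut of that wrapper might look like the sketch below. The project name, target job, and reset job name are all placeholder assumptions, and the dsjob path varies by install; the script just assembles the two calls described in the thread (run the multi-instance reset routine with the target job's name as a parameter, then run the real job). The commands are echoed rather than executed here, since dsjob only exists on a DataStage host:

```shell
#!/bin/sh
# Sketch of the generic wrapper. PROJECT, TARGET_JOB, and the reset job
# "ResetJobs" are hypothetical names; DSHOME varies by install.
PROJECT="MyProject"
TARGET_JOB="LoadInvoices"
DSJOB="${DSHOME:-/opt/Ascential/DataStage/DSEngine}/bin/dsjob"

# Run the multi-instance reset routine first, using the target job name
# both as the invocation id and as the JOBNAME parameter it looks up.
RESET_CMD="$DSJOB -run -jobstatus -param JOBNAME=$TARGET_JOB $PROJECT ResetJobs.$TARGET_JOB"

# Then run the real job, waiting on its exit status.
RUN_CMD="$DSJOB -run -jobstatus -wait $PROJECT $TARGET_JOB"

echo "$RESET_CMD"
echo "$RUN_CMD"
```

On a live system you would execute the commands instead of echoing them, and branch on dsjob's exit code.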
...
* Reset the parsed job and wait for it to finish
Call DSLogInfo(ParsedJob:LastRunStatus, "Resetting job and last run status")
Dummy = DSRunJob(ParsedJob, DSJ.RUNRESET)
Dummy = DSWaitForJob(ParsedJob)
SHOULD BE
* Reset the parsed job and wait for it to finish
Call DSLogInfo(ParsedJob:LastRunStatus, "Resetting job and last run status")
Dummy = DSRunJob(AttachedJob, DSJ.RUNRESET)
Dummy = DSWaitForJob(AttachedJob)
6-10 seconds is about as good as it gets, though you don't have to wait for one reset to finish before you issue the reset request for the next. That is, lose the DSWaitForJob calls and implement a different mechanism for checking whether the jobs are still running, checking the exit status once they've finished, then detaching.
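In the routine itself that means firing all the DSRunJob resets first and only then polling DSGetJobInfo. The same fire-then-poll pattern can be sketched in shell; check_status below is a stub standing in for a `dsjob -jobinfo` (or DSGetJobInfo) status query, since neither is available off a DataStage host:

```shell
#!/bin/sh
# Fire-and-poll pattern: issue all resets up front, then poll until done.
# check_status is a stub; on a real host it would wrap
#   dsjob -jobinfo PROJECT JOB
# and parse the status line. Here it pretends every job has already
# finished with status 21 (DSJS.RESET).
check_status() {
    echo 21
}

JOBS="JobA JobB JobC"

# Phase 1: issue every reset without waiting (stubbed as a message).
for job in $JOBS; do
    echo "reset requested: $job"
done

# Phase 2: poll each job until it leaves the RUNNING state (0),
# then record its final status.
for job in $JOBS; do
    while [ "$(check_status "$job")" -eq 0 ]; do
        sleep 1
    done
    echo "finished: $job status=$(check_status "$job")"
done
```

The overlap is where the time saving comes from: the per-job reset latency runs concurrently instead of serially.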
The dsjob interface doesn't add much overhead.
You would need a complex OR condition to detect all the job status values that could potentially benefit from being reset. These include 3 (job failed), 13 (validation failed), 97 (stopped) and 99 (other). In the last case there is no guarantee that the reset will work; you really need to check for that, too.
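A small helper capturing that condition, using the numeric codes listed above (checking for DSJS.RESET afterwards would be a separate step):

```shell
#!/bin/sh
# Return 0 (true) if the numeric job status warrants a reset.
# Codes per the post above: 3 failed, 13 validation failed,
# 97 stopped, 99 other; everything else needs no reset.
needs_reset() {
    case "$1" in
        3|13|97|99) return 0 ;;
        *)          return 1 ;;
    esac
}

for status in 1 2 3 13 21 97 99; do
    if needs_reset "$status"; then
        echo "status $status: reset"
    else
        echo "status $status: leave alone"
    fi
done
```

The same case-style test maps directly back onto the BASIC If condition, with DSJS.VALFAILED replacing DSJS.RUNWARN as discussed earlier in the thread.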
Ideally, too, you need to pick up the same parameter values with which the job was run when it aborted, so that (for example) the DSN is the same. This can be done with the log interrogation options for dsjob.
The code you found, with Ray's additions, would be fine if you know the exact job names at execution time. We use controlling jobs to initiate streams of DataStage jobs. All of the jobs for a given stream are saved within the same category (GUI folder) in a project, and at any given time jobs may be added to or removed from the stream. I wanted to create a way to reset any aborted jobs within one of these categories without resetting all aborted jobs in a project. I have not had time to finish the wrapper script, but it will just be a simple call to dsjob.