DS JCL Job to reset all jobs in a category, pls. critique

inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

Morning all,

I am working on a process to add to all of our production jobs that call DataStage from UNIX scripts using the dsjob command. Within our architecture we have created projects for each subject area (order/invoice, material management, finance,...). We then use categories to group all the designs that will be called from one UNIX script. A category may contain a single design, or many designs and a controlling design with jcl. We would like to be able to automate the reset process and have come up with the following jcl code. This jcl will then be imported into every project and will be called by a generic UNIX wrapper script. Please feel free to comment (flame suit is on), and nothing will be taken personally.

Code:

**********************************************************************************
*  This job will be called to reset any jobs within the same category as the 
*  passed jobname.  This is a multi-instance job and must be called with an 
*  invocation id appended to the dsjob command.
*
*  NOTE:  1. field number 3 is the job category field of the DS_JOBS hashfile
*         2. ERRNONE and Dummy are used to prevent failure of this script since it 
*            will be run before every dsjob call to execute a datastage job.
**********************************************************************************

 * Function definition to allow path to DataStage Routine UtilityHashLookup
    DefFun UtilityHashLookup(A1,A2,A3) Calling "DSX.UTILITYHASHLOOKUP"

 * Call routine to find the passed job's category within this project
    ThisJobsCategory = UtilityHashLookup("DS_JOBS", JOBNAME, 3)

 * Establish a list of all jobs within this project
    JobList = DSGetProjectInfo(DSJ.JOBLIST)

 * Get the number of jobs to drive the following for loop
    ListCount = Dcount(JobList, ",")

 * Begin for loop to find and, if necessary, reset all jobs within this category
    For ListNumber = 1 to ListCount Step 1

     * Parse JobList to find each jobname
        ParsedJob = Field(JobList, ",", ListNumber)                  

     * Find parsed job category
        ParsedCategory = UtilityHashLookup("DS_JOBS", ParsedJob, 3)  

     * Logic to determine if this parsed job is within the same category as the passed job
        If ParsedCategory = ThisJobsCategory Then

         * Try to attach to the parsed job.  ERRNONE used to avoid failure with uncompiled jobs
            AttachedJob = DSAttachJob(ParsedJob, DSJ.ERRNONE)

         * Logic to determine if attach was made to parsed job
            If (AttachedJob) Then

             * Capture status of last execution
                LastRunStatus = DSGetJobInfo(AttachedJob, DSJ.JOBSTATUS)

             * Logic to determine if parsed job should be reset
                If LastRunStatus = DSJS.RUNFAILED Or LastRunStatus = DSJS.RUNWARN Or LastRunStatus = DSJS.CRASHED Or LastRunStatus = DSJS.STOPPED
                Then

                 * Reset the parsed job and wait for it to finish
                    Call DSLogInfo(ParsedJob:LastRunStatus, "Resetting job and last run status")
                    Dummy = DSRunJob(ParsedJob, DSJ.RUNRESET)
                    Dummy = DSWaitForJob(ParsedJob)

                End

             * Release job handle
                Dummy = DSDetachJob(AttachedJob)

            End

        End

    Next ListNumber

DONE:
Thanks for your time and have a great weekend,

Steve
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Looks OK, except that there's no need to reset for DSJS.RUNWARN (finished with warnings). Jobs that exit with DSJS.VALFAILED, however, do need to be reset.

You don't check to determine that the reset was OK (status should be DSJS.RESET after DSWaitForJob).

:D :D :D Kudos for good documentation! Everyone else take note! :D :D :D

And, if you ever get DS390, using the term JCL for DataStage BASIC code could be misleading. Prefer "Job Control" routine or "before/after subroutine".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

Thanks for the info Ray, I will make the change from runwarn to valfailed.

I had intentionally left out the status check of the reset. I want this job to never fail since it will be run in every production job. The purpose of this is to allow easier after-hours support of production jobs when users may not have access to Director or the UNIX command line. I am predicting that this job will only perform a reset maybe one time in a thousand, and in all other cases it will do absolutely nothing. In the case where it does nothing, I don't want any chance of an error causing a production problem.

Thanks for the recognition on doc., it never hurts to put it in and can be a big time saver in cases where you write something that may only be modified every couple years.

Sorry about the terminology. The whole jcl thing always confuses me when reading the DS doc. (mainframe programmer in past life :lol: )

Steve
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Your code will never perform a reset. I make this assertion based on the axiom that if you've written a handler for something, it will never happen. :lol:

Seriously, though, it's a useful utility to have. There are some other things that could be done at the same time, such as cleaning out old files from the &PH& directory, purging logs of old entries (assuming that you don't have auto-purge configured), and cleaning up any temporary files that you have created in staging areas and the like which are no longer required. You might contemplate adding some or all of these to your routine.

For example, to remove old entries from &PH& you run find (to specify the age) with rm

Code:

Call DSExecute("UNIX","find '&PH&' -atime +7 -exec rm {} \;",Output,ExitStatus)
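
The same cleanup can be sketched as a small shell function for use from a wrapper script. This is only an illustration of the find-with-rm idea above: the function name `purge_old_files` and the `-type f` restriction (so that `rm` is never handed the directory itself) are assumptions, not part of the original one-liner.

```shell
#!/bin/sh
# Hypothetical helper: delete regular files under a directory that have not
# been accessed for more than a given number of days (default 7).
purge_old_files() {
    dir=$1
    days=${2:-7}

    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0

    # -type f limits the match to regular files; note the space before \;
    # which find needs in order to see the end of the -exec command.
    find "$dir" -type f -atime +"$days" -exec rm -f {} \;
}
```

Because the directory name contains ampersands it has to be quoted when passed in, for example `purge_old_files "$PROJECT_DIR/&PH&" 7`, where `PROJECT_DIR` is a placeholder for the project's directory on the server.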
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

You were right, Ray! That code wouldn't reset anything. There is an error in the DSRunJob and DSWaitForJob calls: I passed the variable ParsedJob (the job name) rather than the correct AttachedJob (the job handle). Only took a few minutes of head scratching to find that one :lol: .

In Director this returns within one second for a no-reset case, and takes 6 to 10 seconds for a reset. Do you have any suggestions for improved performance, or is this as good as it gets?

Still have to write the shell script wrapper. Hopefully the dsjob interface will not add much time to the process. If this gets over 2-3 seconds it won't be worth the effort.

Correction to above posted code:

Code:

        .
        .
        .
* Reset the parsed job and wait for it to finish
  Call DSLogInfo(ParsedJob:LastRunStatus, "Resetting job and last run status")
  Dummy = DSRunJob(ParsedJob, DSJ.RUNRESET)
  Dummy = DSWaitForJob(ParsedJob)

SHOULD BE

* Reset the parsed job and wait for it to finish
 Call DSLogInfo(ParsedJob:LastRunStatus, "Resetting job and last run status")
 Dummy = DSRunJob(AttachedJob, DSJ.RUNRESET)
 Dummy = DSWaitForJob(AttachedJob)
Steve
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

6-10 seconds is about as good as it gets, though you don't have to wait for one to finish before you issue the reset request for the next. That is, lose the DSWaitForJob calls, and implement a different mechanism for checking that things are still running or not, checking the exit status once they've finished, then detaching.
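
From a shell wrapper, one way to avoid serialising the resets is to start each reset in the background and collect the results afterwards. The following is a rough sketch of that idea, not the job-control implementation described above: the function name `reset_all` is hypothetical, and it assumes that `dsjob -run -mode RESET -wait` exits with a non-zero code when the request fails.

```shell
#!/bin/sh
# Hypothetical helper: reset_all PROJECT JOB...
# Issues every reset request first, then waits for all of them.
reset_all() {
    project=$1
    shift

    pids=""
    for job in "$@"; do
        # Start the reset in the background so the next one can be issued
        # straight away. Assumes job names contain no spaces or colons.
        dsjob -run -mode RESET -wait "$project" "$job" >/dev/null 2>&1 &
        pids="$pids $!:$job"
    done

    failed=0
    for entry in $pids; do
        pid=${entry%%:*}
        job=${entry#*:}
        # wait returns the exit status of that background command.
        if ! wait "$pid"; then
            echo "reset of $job did not exit cleanly" >&2
            failed=$((failed + 1))
        fi
    done

    # Succeed only if every reset request exited cleanly.
    [ "$failed" -eq 0 ]
}
```

A call would look like `reset_all MyProject JobA JobB JobC`, with the project and job names being placeholders. Whether running several resets at once actually shortens the total time depends on the server.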

The dsjob interlude doesn't add much overhead.
cyh
Participant
Posts: 18
Joined: Tue Jan 20, 2004 3:23 am

Post by cyh »

It sounds GREAT !!

Would you please post the final version of the DS code and the shell wrapper? Thanks a lot !!!!!
cyh
Participant
Posts: 18
Joined: Tue Jan 20, 2004 3:23 am

Post by cyh »

I just found something similar. Use the shell to kick off the job and wait for the return status. If the status is one of the failure values, then reset the job ...

Sample code is shown below for your comments.

Code:

runinfo=`dsjob -run -jobstatus $PROJECT $JOB 2>&1`
dsjobStatus=$?
if dsjobStatus ??
  dsjob -run -mode RESET $PROJECT $JOB
fi
Can it work?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You would need a complex OR condition to detect all possible job status values that would potentially benefit from being reset. These include 3 (job failed), 13 (validation failed), 97 (stopped) and 99 (other). In the last case, there is no guarantee that the reset will work; you really need to check for that, too.
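
In a shell script the OR condition can be written as a `case` statement. The sketch below uses exactly the four status values listed above. The function names are hypothetical, and the value 21 used for a successful reset is an assumption about the numeric value of DSJS.RESET that should be checked against your release.

```shell
#!/bin/sh
# Assumption: exit code reported by "dsjob -jobstatus" after a clean reset.
RESET_OK=${RESET_OK:-21}

# Hypothetical helper: succeeds if the given job status calls for a reset.
needs_reset() {
    case "$1" in
        3|13|97|99) return 0 ;;   # failed, validation failed, stopped, other
        *)          return 1 ;;
    esac
}

# Hypothetical helper: run a job, then reset it if it ended badly.
# Returns the status of the run, or 1 if the reset itself did not succeed.
run_with_reset() {
    project=$1
    job=$2

    status=0
    dsjob -run -jobstatus "$project" "$job" >/dev/null 2>&1 || status=$?

    if needs_reset "$status"; then
        reset_status=0
        dsjob -run -mode RESET -jobstatus "$project" "$job" >/dev/null 2>&1 ||
            reset_status=$?
        if [ "$reset_status" -ne "$RESET_OK" ]; then
            echo "reset of $job ended with status $reset_status" >&2
            return 1
        fi
    fi
    return "$status"
}
```

This does not address re-running with the original parameter values; it only covers the status test and the check that the reset worked.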

Ideally, too, you need to pick up the same parameter values with which the job was run when it aborted, so that (for example) the DSN is the same. This can be done with the log interrogation options for dsjob.
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

CYH,

The code you found, with Ray's additions, would be fine if you know the exact job names at the time of execution. We use controlling jobs to initiate streams of DataStage jobs. All of the jobs for a given stream are saved within the same category (GUI folder) in a project, and at any given time jobs may be added to or removed from the stream. I wanted to create a way to reset any aborted jobs within one of these categories without resetting all aborted jobs in a project. I have not had the time to finish the wrapper script, but it would just be a simple call to dsjob.
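
Since the wrapper was never posted, here is a minimal sketch of what such a call might look like. The reset job name `ResetCategoryJobs` and both function names are made up for illustration. The only details taken from this thread are the JOBNAME parameter, the invocation id appended to the multi-instance job name, and the requirement that the reset step must never fail the caller.

```shell
#!/bin/sh
# Hypothetical name under which the category reset job is saved in each project.
RESET_JOB=${RESET_JOB:-ResetCategoryJobs}

# Run the multi-instance reset job for the category containing JOB.
# Always returns 0 so that a problem here cannot fail the production script.
reset_category_for() {
    project=$1
    job=$2
    invocation=${3:-$$}   # default invocation id: this shell's process id

    dsjob -run -wait -param "JOBNAME=$job" \
        "$project" "$RESET_JOB.$invocation" >/dev/null 2>&1 || true
    return 0
}

# Reset the category first, then run the real job and pass its status back.
run_job() {
    project=$1
    job=$2

    reset_category_for "$project" "$job"
    dsjob -run -jobstatus "$project" "$job"
}
```

An existing production script would then replace its direct dsjob call with something like `run_job OrdersProject LoadOrdersControl`, where both names are placeholders.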

Steve