Starting multiple jobs using StartLoop EndLoop in a Sequence

mczern · Post by **mczern** » Thu Aug 18, 2011 12:52 pm

I would like to iterate through a list of files and call a datastage job to process those files simultaneously using the filename as the invocation id. I have created a sequence that does this. However, the loop is currently waiting for the datastage job to complete before moving to the next file in the list. What I'd like to see is N jobs running simultaneously with their invocation id. Is there a way to do this? Is there something in the trigger that I need to set properly?

I had the job activity inline; that did not work. So I tried adding a sequencer that would fork off to the job activity. That does not work either.

Thanks,
Mike

Code: Select all

 ---read_list_file----|
                      |
                      v
      end loop------>loop----set_parms
           |   (for each file)  |
           |                    |
           ^------sequencer-----<
                     |
                     |
                     V
              job_activity 
       (invocation_id: filename from list)
            |                 |
            |                 |
            v                 v
      email_success      email_fail

chulett · Post by **chulett** » Thu Aug 18, 2011 1:23 pm

The Job Activity stage runs a job and then waits for it to complete, that functionality cannot be changed. You'd have to "roll your own", say something like a routine does all of the same steps except the DSWaitForJob() call or a script that leverages dsjob without either of the wait options.

The only issue would be monitoring the status of all the jobs...

mczern · Post by **mczern** » Thu Aug 18, 2011 1:40 pm

Does the Loop have to wait for the Job Activity to finish even if I "fork" the flow to continue? So, File1 MUST complete before File2 is even processed.

The Job Activity does a 'Wait for File' So, I don't mind the Master Sequence waiting for all the children to finish before it finishes; That would be a nice to have to tie up the work package. I just want to have all the jobs waiting for their individual files at the same time since the files will not generally arrive in order. File5 may arrive before File3 which arrives after File1. And it could be several hours/days before any one file arrives. The only information I have is the list of files to process as soon as they arrive.

If you have any suggestions on a better way to implement, I'd greatly appreciate it.

ray.wurlod · Post by **ray.wurlod** » Thu Aug 18, 2011 3:16 pm

Use sub-sequences each with its own WaitForFile activity and Job activity.

kduke · Post by **kduke** » Thu Aug 18, 2011 8:54 pm

Routine is easy to write. You need to do a couple things to get parameters to work unless all jobs have the same parameter list.

When you attach to a job you need to get the valid parameters for that job. Next you can only set a parameter that exists in the job you are about to run. I can probably post some of this code if you need it. I have posted similar code done in a shell script.

mczern · Post by **mczern** » Fri Aug 19, 2011 10:45 am

ray.wurlod wrote:Use sub-sequences each with its own WaitForFile activity and Job activity.

Ray- how would this be different from what I have currently designed? The job i'm calling from the master is a sequence with a Wait For File inside. The problem I'm having is that the master sequence will not invoke the subsequence and then continue to loop without waiting for the subsequence to finish so that I have more than one subsequence running at the same time.

mczern · Post by **mczern** » Fri Aug 19, 2011 10:49 am

kduke wrote:Routine is easy to write. You need to do a couple things to get parameters to work unless all jobs have the same parameter list.

When you attach to a job you need to get the valid parameters for that job. Next you can only set a parameter that exists in the job you are about to run. I can probably post some of this code if you need it. I have posted similar code done in a shell script.

If you can post an example or point me to one of your prior posts that might be helpful. I would have thought this would be a fairly common use for the looping construct.

ray.wurlod · Post by **ray.wurlod** » Fri Aug 19, 2011 4:34 pm

The master sequence always waits for a sub-sequence. You can take the sub-sequence code and turn it into a routine that returns immediately (remove the call to DSWaitForJob), but you would then need somehow to code for waiting for all the sub-sequences to finish.

chulett · Post by **chulett** » Fri Aug 19, 2011 5:49 pm

mczern wrote:I would have thought this would be a fairly common use for the looping construct.

Yah... not so much.

kduke · Post by **kduke** » Fri Aug 19, 2011 8:37 pm

This is all I could find.

viewtopic.php?t=93365
viewtopic.php?t=108490&highlight=-lparams
viewtopic.php?t=109662&highlight=-lparams

Ray is correct on how to build a routine. You could also start with SDK utility run job routine. That might help. Lots of posts on here about using it.

mczern · Post by **mczern** » Wed Aug 24, 2011 7:06 am

Thanks Uncle Kim for the script.. I have stripped it down and added a flag.
Still seems like the Job Activity should have a switch that disables the wait for finish.

Code: Select all

#!/bin/ksh
################################################################################
# File Name    : RunJob.ksh
# Description  : runs DataStage job
# Author       : Mike Czerniawski (3sage Consulting)
# Created      : 08/23/2011
# Modified     :
# Invocation   : Run from command line or from Execute Command sequence call
#                runjob.ksh [wait|no_wait] project job <parm list>
# Prerequisites: None
#
# Change History
# Who              When        What
# ###############  ##########  #################################################
# Kim Duke         06/09/2005  Original Version
#
################################################################################

THIS_SCRIPT="runjob.ksh"

CURRENT_DATE=`date +%Y%m%d`
EXTRACT_DATE=`date +'%Y%m%dT%H%M%S'`
DSHOME=`cat /.dshome`
WAIT_FLG=$1
shift
PROJECT=$1
shift
DSJOB=$1
shift
# -param ParamName=Value
SET_PARAMS="$*"
cd $DSHOME
. ./dsenv

################################################################################
# Check job status here
################################################################################
JOB_STATUS=`dsjob -jobinfo $PROJECT $DSJOB | head -1 | cut -d"(" -f2 | cut -d")" -f1`
echo "Before run JOB_STATUS=$JOB_STATUS"
case $JOB_STATUS in
################################################################################
# 0 "Running"
################################################################################
   0)
      echo "Job $DSJOB already running."
      exit 1
      ;;
################################################################################
# Runnable Job Status (do nothing)
# 1 "Finished"
# 2 "Finished (see log)"
# 9 "Has been reset"
# 11 "Validated OK"
# 12 "Validated (see log)"
# 21 "Has been reset"
# 99 "Compiled"
################################################################################
   1|2|7|9|11|12|21|99)
      :
      ;;
################################################################################
# NOT Runnable Job Status (reset job)
# 0 "Running"
# 3 "Aborted"
# 8 "Failed validation"
# 13 "Failed validation"
# 96 "Aborted"
# 97 "Stopped"
# 98 "Not Compiled"
################################################################################
   *)
      echo "dsjob -run -mode RESET -wait $PROJECT $DSJOB"
      dsjob -run -mode RESET -wait $PROJECT $DSJOB
      RETURN_VALUE=$?
      if [ $RETURN_VALUE -ne 0 ]
      then
         echo "Unable to reset job $DSJOB alreaady running."
         exit $RETURN_VALUE
      fi
esac

################################################################################
# Run job here
################################################################################

if [ $WAIT_FLG -eq no_wait ]
then
   echo "dsjob -run ${SET_PARAMS} $PROJECT $DSJOB"
   dsjob -run ${SET_PARAMS} $PROJECT $DSJOB
else
   echo "dsjob -run -jobstatus -wait ${SET_PARAMS} $PROJECT $DSJOB"
   dsjob -run -jobstatus -wait ${SET_PARAMS} $PROJECT $DSJOB
fi

RETURN_VALUE=$?
if [ $RETURN_VALUE -eq 1  -o $RETURN_VALUE -eq 2 -o $RETURN_VALUE -eq 0 ]
then
   echo job completed successfully
else
   echo "Error: $DSJOB job failed. error code was $RETURN_VALUE"
   echo "Job $DSJOB failed."
   exit $RETURN_VALUE
fi

ray.wurlod · Post by **ray.wurlod** » Wed Aug 24, 2011 3:52 pm

Maybe the logic should handle $RETURN_VALUE -eq 0 separately. This means that the job start request was successful and that the job is still running. You have no idea when it will end.

mczern · Post by **mczern** » Thu Aug 25, 2011 1:43 pm

ray.wurlod wrote:Maybe the logic should handle $RETURN_VALUE -eq 0 separately. This means that the job start request was successful and that the job is still running. You have no idea when it will end.

Good point.

Code: Select all

#!/bin/ksh 
################################################################################ 
# File Name    : RunJob.ksh 
# Description  : runs DataStage job 
# Author       : Mike Czerniawski (3sage Consulting) 
# Created      : 08/23/2011 
# Modified     : 
# Invocation   : Run from command line or from Execute Command sequence call 
#                runjob.ksh [wait|no_wait] project job <parm list> 
# Prerequisites: None 
# 
# Change History 
# Who              When        What 
# ###############  ##########  ################################################# 
# Kim Duke         06/09/2005  Original Version 
# 
################################################################################ 

THIS_SCRIPT="runjob.ksh" 

CURRENT_DATE=`date +%Y%m%d` 
EXTRACT_DATE=`date +'%Y%m%dT%H%M%S'` 
DSHOME=`cat /.dshome` 
WAIT_FLG=$1 
shift 
PROJECT=$1 
shift 
DSJOB=$1 
shift 
# -param ParamName=Value 
SET_PARAMS="$*" 
cd $DSHOME 
. ./dsenv 

################################################################################ 
# Check job status here 
################################################################################ 
JOB_STATUS=`dsjob -jobinfo $PROJECT $DSJOB | head -1 | cut -d"(" -f2 | cut -d")" -f1` 
echo "Before run JOB_STATUS=$JOB_STATUS" 
case $JOB_STATUS in 
################################################################################ 
# 0 "Running" 
################################################################################ 
   0) 
      echo "Job $DSJOB already running." 
      exit 1 
      ;; 
################################################################################ 
# Runnable Job Status (do nothing) 
# 1 "Finished" 
# 2 "Finished (see log)" 
# 9 "Has been reset" 
# 11 "Validated OK" 
# 12 "Validated (see log)" 
# 21 "Has been reset" 
# 99 "Compiled" 
################################################################################ 
   1|2|7|9|11|12|21|99) 
      : 
      ;; 
################################################################################ 
# NOT Runnable Job Status (reset job) 
# 0 "Running" 
# 3 "Aborted" 
# 8 "Failed validation" 
# 13 "Failed validation" 
# 96 "Aborted" 
# 97 "Stopped" 
# 98 "Not Compiled" 
################################################################################ 
   *) 
      echo "dsjob -run -mode RESET -wait $PROJECT $DSJOB" 
      dsjob -run -mode RESET -wait $PROJECT $DSJOB 
      RETURN_VALUE=$? 
      if [ $RETURN_VALUE -ne 0 ] 
      then 
         echo "Unable to reset job $DSJOB alreaady running." 
         exit $RETURN_VALUE 
      fi 
esac 

################################################################################ 
# Run job here 
################################################################################ 

if [ $WAIT_FLG -eq no_wait ] 
then 
   echo "dsjob -run ${SET_PARAMS} $PROJECT $DSJOB" 
   dsjob -run ${SET_PARAMS} $PROJECT $DSJOB 
else 
   echo "dsjob -run -jobstatus -wait ${SET_PARAMS} $PROJECT $DSJOB" 
   dsjob -run -jobstatus -wait ${SET_PARAMS} $PROJECT $DSJOB 
fi 

RETURN_VALUE=$?
case $RETURN_VALUE in
1|2)
   echo job completed successfully
   ;;
0)
   echo "Job $PROJECT $DSJOB running."
   ;;
*)
   echo "Error: $DSJOB job failed. error code was $RETURN_VALUE"
   echo "Job $DSJOB failed."
esac
   exit $RETURN_VALUE