Load Balancing using job Control

Posted: Wed Sep 19, 2007 5:25 am
by psdhali
Hi,

We have around 30 jobs which load data to Teradata. When we run them all in parallel (plain parallel execution, not PX) we run into resource issues.

Currently we run them in batches of 5 parallel jobs, one batch after another.

Does anyone have an idea how to write a job control routine for the following?
Input -
List of jobs to be run and list of corresponding param values
Number of jobs to run in parallel "XX"

Flow -
Start "XX" jobs and, when any one finishes, start the next one, and so on. We also need to know whether each job finished OK; if not, we need to send the status back after the currently running jobs finish (plus maybe a DS log entry).

Reason - This will reduce the total time taken for all 30 jobs to finish and will give us the flexibility to run "XX" jobs in parallel without changing the sequence.

Best Regards
Preetinder

Posted: Wed Sep 19, 2007 7:18 am
by kduke
Ken Bland gave away job control which would do this. Do a search. Craig uses it; maybe he can explain it.

Posted: Wed Sep 19, 2007 7:34 am
by chulett
Exactly - don't re-invent the wheel unless you are extremely bored or have time to properly address the challenge. Ken has already done all this hard work for you and gives it away for free. The only 'limitation' I've found is in the parameter handling, which works well 99% of the time but occasionally causes us issues. Kim has a more robust scripted parameter approach he's discussed here before which would make it work 100% of the time if I married the two techniques. Some day. :(

Or roll your own. Ken's is quite robust and flexible, but you could gen up minimal functionality fairly quickly using a looping structure and arrays for the basic information - job list, status, handles, etc.

Posted: Wed Sep 19, 2007 9:42 am
by psdhali
Thanks a lot for the reply. Can you point me to where I can find those utilities?

Thanks
Preetinder

Posted: Wed Sep 19, 2007 9:53 am
by chulett

Posted: Wed Sep 19, 2007 9:59 am
by psdhali
Thanks, now I am waiting for access to the website. Looks like they review my profile?? before granting me access to their publications and utilities.

Regards

Posted: Wed Sep 19, 2007 4:45 pm
by ray.wurlod
You can also create a job sequence with N streams of job activities. This will restrict execution to at most N jobs at a time.

   Job1 ----> Job5 ----> Job9 ----> Job13 ----> Job17
   Job2 ----> Job6 ----> Job10 ----> Job14 ----> Job18
   Job3 ----> Job7 ----> Job11 ----> Job15 ----> Job19
   Job4 ----> Job8 ----> Job12 ----> Job16 ----> Job20
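The layout above amounts to dealing the job list round-robin into N fixed streams. A minimal sketch of that assignment, in Python purely for illustration (the function name `assign_streams` is hypothetical and is not part of DataStage):

```python
def assign_streams(jobs, n):
    """Deal jobs round-robin into n fixed streams (illustrative only)."""
    # Stream k gets jobs k, k+n, k+2n, ... so each stream runs its jobs in order.
    return [jobs[k::n] for k in range(n)]


jobs = [f"Job{i}" for i in range(1, 21)]
for stream in assign_streams(jobs, 4):
    print(" ----> ".join(stream))
```

Because the assignment is fixed up front, a stream that happens to get the slow jobs keeps running while the other streams sit idle.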

Posted: Wed Sep 19, 2007 4:47 pm
by chulett
Ah... Poor Man's Load Balancing. :wink:

Posted: Wed Sep 19, 2007 4:53 pm
by ArndW
If you can't access Ken's code you can code it in DataStage without too much effort. I have used this kind of resource control extensively, in fact I'm monitoring a job right now that takes the number of instances to run in parallel as a parameter.

You cannot use the Job Activity stage in a sequence for this, since it waits for the job to finish before continuing.

Pseudocode:
DIMension HandleArray(NumberOfParallelRuns)
MAT HandleArray = ''

Loop while things left to do
   Search HandleArray for an empty slot
   If found
   Then
      DSAttachJob() the next job, store its handle in that slot, then run the job
   Else
      Wait a while
      Cycle through the array getting the job status for each element; DSDetachJob() any that have finished and set their handle to empty
   End
Endloop
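The same slot-array idea can be tried outside DataStage. The following is only a sketch under stated assumptions: Python threads stand in for job handles, and `run_throttled`, `run_job` and `poll_seconds` are hypothetical names, not DataStage API calls.

```python
import threading
import time


def run_throttled(jobs, max_parallel, run_job, poll_seconds=0.01):
    """Run every job, keeping at most max_parallel running at once."""
    slots = [None] * max_parallel          # None marks an empty slot
    pending = list(jobs)
    # Keep looping while work is left to start or something is still running.
    while pending or any(s is not None for s in slots):
        for idx in range(max_parallel):
            handle = slots[idx]
            # Release the slot of a job that has finished.
            if handle is not None and not handle.is_alive():
                slots[idx] = handle = None
            # Fill an empty slot with the next pending job.
            if handle is None and pending:
                worker = threading.Thread(target=run_job, args=(pending.pop(0),))
                worker.start()
                slots[idx] = worker
        time.sleep(poll_seconds)           # wait a while before polling again
```

Unlike the pseudocode, this version keeps polling until the last running job has ended, so the caller does not return while jobs are still active.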

Posted: Thu Sep 20, 2007 9:10 am
by psdhali
ray.wurlod wrote: You can also create a job sequence with N streams of job activities. This will restrict execution to at most N jobs at a time.

   Job1 ----> Job5 ----> Job9 ----> Job13 ----> Job17 ...

Hi Ray,

This is what we are doing today: running N jobs in parallel and then waiting for the status of all of them before starting the next N. The idea is to start another job as soon as any one finishes, i.e. a dynamic decision (for example, some jobs may finish earlier because they have less data).
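To see why the distinction matters, here is a rough simulation in Python comparing the two policies. The durations are made-up numbers chosen for illustration, and both function names are hypothetical.

```python
import heapq


def batch_makespan(durations, n):
    """Run in batches of n; each batch waits for its slowest job."""
    return sum(max(durations[i:i + n]) for i in range(0, len(durations), n))


def dynamic_makespan(durations, n):
    """Start the next job as soon as any of the n slots frees up."""
    slots = [0] * n                 # time at which each slot becomes free
    heapq.heapify(slots)
    for d in durations:
        free_at = heapq.heappop(slots)       # earliest available slot
        heapq.heappush(slots, free_at + d)
    return max(slots)


durations = [10, 1, 1, 1, 1, 10]    # illustrative run times, not measurements
print(batch_makespan(durations, 2))     # 21
print(dynamic_makespan(durations, 2))   # 14
```

With these numbers the batch approach pays for the slowest job in every batch, while the dynamic approach lets the short jobs flow through the second slot during the first long job.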

Posted: Thu Sep 20, 2007 9:59 am
by psdhali
OK, we have started writing it (our developer, not me :) ). We already have job control which sets params for a job and runs it, so we will change it to use arrays: load the arrays with job names and params, and then do the looping part.

Will post it here once done.

Thanks

solution...

Posted: Mon Oct 01, 2007 3:41 am
by kausmone
Hi Everyone,

Thanks for responding to the topic (I am the developer mentioned above :) ).

We have an existing multiple-instance job control that calls the actual job based on a parameter, "SUBFILE_RUN_LIST". In the previous environment, this job control was called multiple times in the 'poor man's load balancing' described in earlier posts. We have now built a small routine which reads the list of all possible "SUBFILE_RUN_LIST" values (currently 21 in all) from a sequential file and calls the aforementioned job control.

Here's the code:

$INCLUDE DSINCLUDE JOBCONTROL.H

* Holds up to 25 SUBFILE_RUN_LIST values read from the list file
DIM SubFileRunListArray(25)
MAT SubFileRunListArray = 0

vJobName = ''

* One slot per concurrently running job; 0 marks a free slot
DIM JobHandleArray(MAX_SIM_RUNS)
MAT JobHandleArray = 0

OpenSeq SCRIPT_FOLDER:"/subfile_run_list.txt" To FileVar Else
   Call DSLogWarn("Cannot open subfile_run_list.txt file and error status is " : Status(), '')
   * Nothing can be read without the list file, so stop here
   Abort
End

* Read the list of jobs to run from the text file into an array
ctrJobsToRun = 0
Loop
   ReadSeq FileLine From FileVar Else Exit
   ctrJobsToRun = ctrJobsToRun + 1
   SubFileRunListArray(ctrJobsToRun) = Trim(FileLine)
Repeat

CloseSeq FileVar

i = 1

* Keep going until every entry in the list has been started
Loop Until i > ctrJobsToRun
   * Start a job in each free slot while there is still work left
   For j = 1 To MAX_SIM_RUNS
      If JobHandleArray(j) = 0 And i <= ctrJobsToRun Then
         vJobName = "JB_Load_SQLoop_SUBFILE.":i
         JobHandleArray(j) = DSAttachJob(vJobName, DSJ.ERRWARN)
         JobHandleArray(j) = DSPrepareJob(JobHandleArray(j))

         ErrCode = DSSetJobLimit(JobHandleArray(j), DSJ.LIMITWARN, 0)
         ErrCode = DSSetParam(JobHandleArray(j), "TRG_DB_DIRECTOR", TRG_DB_DIRECTOR)
         ErrCode = DSSetParam(JobHandleArray(j), "TRG_DB_NAME", TRG_DB_NAME)
         ErrCode = DSSetParam(JobHandleArray(j), "TRG_DB_USERID", TRG_DB_USERID)
         ErrCode = DSSetParam(JobHandleArray(j), "TRG_DB_PASSWORD", TRG_DB_PASSWORD)
         ErrCode = DSSetParam(JobHandleArray(j), "FILE_TYPE_ID", FILE_TYPE_ID)
         ErrCode = DSSetParam(JobHandleArray(j), "RECEPTION_FOLDER", RECEPTION_FOLDER)
         ErrCode = DSSetParam(JobHandleArray(j), "LOG_FOLDER", LOG_FOLDER)
         ErrCode = DSSetParam(JobHandleArray(j), "CSV_FILE_DIR", CSV_FILE_DIR)
         ErrCode = DSSetParam(JobHandleArray(j), "ARCH_FOLDER", ARCH_FOLDER)
         ErrCode = DSSetParam(JobHandleArray(j), "STAGING_FOLDER", STAGING_FOLDER)
         ErrCode = DSSetParam(JobHandleArray(j), "SUBFILE_RUN_LIST", SubFileRunListArray(i))
         ErrCode = DSSetParam(JobHandleArray(j), "SCRIPT_FOLDER", SCRIPT_FOLDER)
         ErrCode = DSRunJob(JobHandleArray(j), DSJ.RUNNORMAL)
         Call DSLogInfo("Running job for ":SubFileRunListArray(i), "GNSJobRunOpt")

         i = i + 1
      End
   Next j

   * Pause before polling so the loop does not spin (the interval is arbitrary)
   Sleep 10

   * Release the slots of jobs that have finished; stop on any failure
   For k = 1 To MAX_SIM_RUNS
      * Skip free slots: there is no job handle to query
      If JobHandleArray(k) <> 0 Then
         JobCurrStatus = DSGetJobInfo(JobHandleArray(k), DSJ.JOBSTATUS)
         If JobCurrStatus = DSJS.RUNFAILED Or JobCurrStatus = DSJS.CRASHED Then
            Abort
         End
         If JobCurrStatus = DSJS.RUNOK Or JobCurrStatus = DSJS.RUNWARN Then
            ErrCode = DSDetachJob(JobHandleArray(k))
            JobHandleArray(k) = 0
         End
      End
   Next k
Repeat



Posted: Mon Oct 01, 2007 4:39 am
by ray.wurlod
You don't handle the situation that there are more than 25 jobs in the file. And where does MAX_SIM_RUNS get set?

Posted: Mon Oct 01, 2007 4:53 am
by kausmone
You are right; it is just that in this case I know the total number of jobs. Otherwise the number of jobs can be passed as a parameter, just as MAX_SIM_RUNS is: MAX_SIM_RUNS is passed to the routine from the calling sequencer. Thanks for the input!

Regards,
kaustubh

Posted: Tue Oct 02, 2007 7:42 am
by kausmone
We observed that the calling routine was finishing even though the jobs it had started had not completed execution. I added the following code at the end to avoid this, and am including it here for the sake of completeness of the solution:

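For k = 1 To MAX_SIM_RUNS
   * Skip slots that were already released in the main loop
   If JobHandleArray(k) <> 0 Then
      * Block until this job finishes, then check how it ended
      ErrCode = DSWaitForJob(JobHandleArray(k))
      JobCurrStatus = DSGetJobInfo(JobHandleArray(k), DSJ.JOBSTATUS)
      ErrCode = DSDetachJob(JobHandleArray(k))
      JobHandleArray(k) = 0
      If JobCurrStatus = DSJS.RUNFAILED Or JobCurrStatus = DSJS.CRASHED Then
         Abort
      End
   End
Next k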

This was added after the final "Repeat" statement.