Load Balancing using job Control
Hi,
We have around 30 jobs which load data to Teradata. When we run them all in parallel (plain parallel, not PX) we end up with resource issues.
Currently we run them in batches of 5 jobs in parallel, one batch after another.
Does someone have an idea how to write a job control routine for the following?
Input -
List of jobs to be run and list of corresponding parameter values
Number of jobs to run in parallel "XX"
Flow -
Start "XX" jobs and, when any one finishes, start the next one, and so on. We also need to know whether each job has finished OK; otherwise we need to send a status back after the currently running jobs finish (plus maybe a DS log entry).
Reason - This will reduce the time taken for all 30 jobs to finish and will give the flexibility to run "XX" jobs in parallel without changing the sequence.
Best Regards
Preetinder
Exactly - don't re-invent the wheel unless you are extremely bored or have time to properly address the challenge. Ken has already done all this hard work for you and gives it away for free. The only 'limitation' I've found is in the parameter handling, which works well 99% of the time but occasionally causes us issues. Kim has a more robust scripted parameter approach he's discussed here before, which would make it work 100% of the time if I married the two techniques. Some day.
Or roll your own. Ken's is quite robust and flexible, but you could gen up minimal functionality fairly quickly using a looping structure and arrays for the basic information - job list, status, handles, etc.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Ken's website: http://www.kennethbland.com/
-craig
You can also create a job sequence with N streams of job activities. This will restrict execution to at most N jobs at a time.
Job1 ----> Job5 ----> Job9 ----> Job13 ----> Job17
Job2 ----> Job6 ----> Job10 ----> Job14 ----> Job18
Job3 ----> Job7 ----> Job11 ----> Job15 ----> Job19
Job4 ----> Job8 ----> Job12 ----> Job16 ----> Job20
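As a rough illustration of how the jobs end up distributed in such a layout, the sketch below (plain Python, not DataStage code) deals a job list round-robin into N streams. The function name `build_streams` and the job names are made up for the example; in a real sequence the streams are drawn by hand in the Designer.

```python
def build_streams(jobs, n):
    """Split the job list round-robin into n streams that each run sequentially."""
    return [jobs[i::n] for i in range(n)]


# Hypothetical job names, matching the diagram above.
jobs = [f"Job{k}" for k in range(1, 21)]

for stream in build_streams(jobs, 4):
    # Each printed line is one stream; at most 4 jobs run at any moment.
    print(" ----> ".join(stream))
```

Note that each stream only advances when its own previous job finishes, so a stream that happens to hold the long-running jobs becomes the bottleneck.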
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
If you can't access Ken's code you can code it in DataStage without too much effort. I have used this kind of resource control extensively, in fact I'm monitoring a job right now that takes the number of instances to run in parallel as a parameter.
You cannot use the Job Activity in a sequencer, since it will issue a WAIT.
Pseudocode:
DIMension HandleArray(NumberOfParallelRuns)
MAT HandleArray = ''
Loop while there are jobs left to start
   Search HandleArray for an empty slot
   If an empty slot is found
   Then
      DSAttachJob() the next job, store its handle in that slot, then DSRunJob() it
   Else
      Wait a while
      Cycle through the array, getting the job status for all occupied slots;
      any job that has now finished is DSDetachJob()ed and its slot set to empty
   End
   Go back to beginning of loop
Endloop
After the loop, wait for the jobs that are still running and check their statuses
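The slot-array idea can be tried out without a DataStage server. The following is a minimal Python simulation of it, not DataStage code: `dispatch`, `make_poll`, and the status strings are hypothetical names invented for this sketch, and `poll` stands in for a DSGetJobInfo() status check. It also covers the original request that, after a failure, no new jobs are started but the running ones are allowed to finish.

```python
def dispatch(jobs, max_slots, poll):
    """Keep up to max_slots jobs running; poll(name) returns 'RUNNING', 'OK' or 'FAILED'.

    Returns (finished_ok, failed, never_started).
    """
    slots = [None] * max_slots      # None marks an empty slot
    pending = list(jobs)
    finished_ok, failed = [], []
    while pending or any(s is not None for s in slots):
        # Fill empty slots, unless a failure has already been seen.
        for i, name in enumerate(slots):
            if name is None and pending and not failed:
                slots[i] = pending.pop(0)
        # Poll every occupied slot and free the ones that have finished.
        for i, name in enumerate(slots):
            if name is None:
                continue
            status = poll(name)
            if status == "OK":
                finished_ok.append(name)
                slots[i] = None
            elif status == "FAILED":
                failed.append(name)
                slots[i] = None
        # After a failure, stop once the jobs that were still running are done.
        if failed and all(s is None for s in slots):
            break
    return finished_ok, failed, pending


def make_poll(polls_needed, bad_jobs):
    """Build a fake status check: each job finishes after a given number of polls."""
    remaining = dict(polls_needed)

    def poll(name):
        remaining[name] -= 1
        if remaining[name] > 0:
            return "RUNNING"
        return "FAILED" if name in bad_jobs else "OK"

    return poll


poll = make_poll({"A": 1, "B": 3, "C": 1, "D": 1}, bad_jobs={"C"})
print(dispatch(["A", "B", "C", "D"], 2, poll))
```

A real routine would sleep between polling passes rather than spin; the simulation leaves that out because the fake jobs advance one step per poll.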
ray.wurlod wrote: You can also create a job sequence with N streams of job activities. This will restrict execution to at most N jobs at a time.
Job1 ----> Job5 ----> Job9 ----> Job13 ----> Job17 ...
Hi Ray,
This is what we are doing today: running N jobs in parallel and then waiting for the status of all of them before starting the next N. The idea is to start another job as soon as any one finishes, i.e. a dynamic decision (for example, some jobs may finish earlier because they have less data).
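To put a number on the difference, here is a small Python simulation (again not DataStage code) comparing fixed batches of N with a pool that refills a slot as soon as it frees up. The run times are invented for illustration, and `batch_makespan` and `pool_makespan` are hypothetical helper names.

```python
import heapq


def batch_makespan(durations, n):
    """Fixed batches of n: each batch lasts as long as its slowest job."""
    return sum(max(durations[i:i + n]) for i in range(0, len(durations), n))


def pool_makespan(durations, n):
    """Dynamic pool: the next job starts on whichever slot becomes free first."""
    slots = [0] * n                     # time at which each slot becomes free
    heapq.heapify(slots)
    for d in durations:
        free_at = heapq.heappop(slots)  # earliest slot to finish
        heapq.heappush(slots, free_at + d)
    return max(slots)


# Made-up run times in minutes: two long jobs among many short ones.
durations = [10, 2, 2, 2, 10, 2, 2, 2]
print("batches of 4:", batch_makespan(durations, 4))
print("pool of 4:   ", pool_makespan(durations, 4))
```

With these invented numbers the batch approach takes 20 minutes and the pool takes 12, because the short jobs no longer wait for the long job in their batch. The gain in practice depends entirely on how uneven the real run times are.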
solution...
Hi Everyone,
Thanks for responding to the topic (I am the developer mentioned above).
We have an existing multiple-instance job control that calls the actual job, based on a parameter "SUBFILE_RUN_LIST". In the previous environment, this job control was called multiple times in the 'poor-man's load balancing' described in earlier posts. We have now built a small routine which reads the list of all possible "SUBFILE_RUN_LIST" values (currently 21 in all) from a sequential file and calls the aforementioned job control.
Here's the code:
#include DSINCLUDE JOBCONTROL.H
DIM SubFileRunListArray(25)
MAT SubFileRunListArray=0
vJobName=''
DIM JobHandleArray(MAX_SIM_RUNS)
MAT JobHandleArray=0
OpenSeq SCRIPT_FOLDER:"/subfile_run_list.txt" To FileVar else
   Call DSLogWarn("Cannot open subfile_run_list.txt file and error status is " : status(), '')
   * nothing can be started without the list, so stop here
   Abort
End
* read list of jobs to run from text file into an array
ctrJobsToRun=0
Loop
   ReadSeq FileLine From FileVar Else EXIT
   ctrJobsToRun=ctrJobsToRun+1
   SubFileRunListArray(ctrJobsToRun) = Trim(FileLine)
Repeat
CloseSeq FileVar
i=1
* loop until every entry read from the file has been started
Loop Until i > ctrJobsToRun
   * start a job in every free slot while entries remain
   For j=1 To MAX_SIM_RUNS Step 1
      If JobHandleArray(j)=0 And i <= ctrJobsToRun
      Then
         vJobName="JB_Load_SQLoop_SUBFILE.":i
         JobHandleArray(j)=DSAttachJob(vJobName,DSJ.ERRWARN)
         JobHandleArray(j)=DSPrepareJob(JobHandleArray(j))
         ErrCode=DSSetJobLimit(JobHandleArray(j),DSJ.LIMITWARN,0)
         ErrCode=DSSetParam(JobHandleArray(j), "TRG_DB_DIRECTOR", TRG_DB_DIRECTOR)
         ErrCode=DSSetParam(JobHandleArray(j), "TRG_DB_NAME", TRG_DB_NAME)
         ErrCode=DSSetParam(JobHandleArray(j), "TRG_DB_USERID", TRG_DB_USERID)
         ErrCode=DSSetParam(JobHandleArray(j), "TRG_DB_PASSWORD", TRG_DB_PASSWORD)
         ErrCode=DSSetParam(JobHandleArray(j), "FILE_TYPE_ID", FILE_TYPE_ID)
         ErrCode=DSSetParam(JobHandleArray(j), "RECEPTION_FOLDER", RECEPTION_FOLDER)
         ErrCode=DSSetParam(JobHandleArray(j), "LOG_FOLDER", LOG_FOLDER)
         ErrCode=DSSetParam(JobHandleArray(j), "CSV_FILE_DIR", CSV_FILE_DIR)
         ErrCode=DSSetParam(JobHandleArray(j), "ARCH_FOLDER", ARCH_FOLDER)
         ErrCode=DSSetParam(JobHandleArray(j), "STAGING_FOLDER", STAGING_FOLDER)
         ErrCode=DSSetParam(JobHandleArray(j), "SUBFILE_RUN_LIST", SubFileRunListArray(i))
         ErrCode=DSSetParam(JobHandleArray(j), "SCRIPT_FOLDER", SCRIPT_FOLDER)
         ErrCode=DSRunJob(JobHandleArray(j),DSJ.RUNNORMAL)
         Call DSLogInfo("Running job for ":SubFileRunListArray(i),"GNSJobRunOpt")
         i=i+1
      End
   Next j
   * pause between polls instead of spinning (the 10 seconds is an arbitrary choice)
   Sleep 10
   * check the occupied slots and free those whose job has finished
   For k=1 To MAX_SIM_RUNS Step 1
      If JobHandleArray(k)<>0
      Then
         JobCurrStatus=DSGetJobInfo(JobHandleArray(k),DSJ.JOBSTATUS)
         If JobCurrStatus=DSJS.RUNFAILED Or JobCurrStatus=DSJS.CRASHED
         Then
            Abort
         End
         If JobCurrStatus=DSJS.RUNOK Or JobCurrStatus=DSJS.RUNWARN
         Then
            ErrCode=DSDetachJob(JobHandleArray(k))
            JobHandleArray(k)=0
         End
      End
   Next k
Repeat
Last edited by kausmone on Mon Oct 01, 2007 4:49 am, edited 1 time in total.
You are right; it is just that in this case I know the total number of jobs. Otherwise the number of jobs can be passed as a parameter, just as MAX_SIM_RUNS is passed to the routine from the calling sequence. Thanks for the input!
Regards,
kaustubh
It was observed that the calling routine was finishing even though the jobs it had started hadn't completed execution. I added the following code at the end to avoid this, and am including it here for the sake of completeness of the solution:
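For k=1 To MAX_SIM_RUNS Step 1
   * skip slots that were already freed in the main loop
   If JobHandleArray(k)<>0
   Then
      JobCurrStatus=DSGetJobInfo(JobHandleArray(k),DSJ.JOBSTATUS)
      If JobCurrStatus=DSJS.RUNNING
      Then
         ErrCode=DSWaitForJob(JobHandleArray(k))
         * re-read the status now that the job has finished
         JobCurrStatus=DSGetJobInfo(JobHandleArray(k),DSJ.JOBSTATUS)
      End
      If JobCurrStatus=DSJS.RUNFAILED Or JobCurrStatus=DSJS.CRASHED
      Then
         Abort
      End
   End
Next k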
Added this after the final "Repeat" statement
Last edited by kausmone on Wed Oct 03, 2007 2:09 am, edited 1 time in total.