I have a job sequence that kicks off a large number of independent jobs. There are no links between the job activities, because there are no dependencies.
What I want is some mechanism to limit the load placed on the system. Ideally I'd like functionality like the Unix at/batch/cron facilities, which restrict the number of simultaneous jobs on a given queue: when one job finishes, another starts.
Things I have considered:
* Add links or bust the Job Sequence into multiple linked sequences. Not bad, but sub-optimal use of resources because I won't always have the maximum number of jobs running.
* Increment/decrement a counter myself in a hash file - not nice, as I don't like my chances of handling aborts correctly.
* I could use a before/after-job subroutine to acquire/release a semaphore. I've seen some references to semaphores in this forum, but cannot find the manual that describes them properly. It sounds like I would have to nominate which lock number each job was to use, rather than allocating a pool of semaphores which every job could use.
* I've got some C programs that implement sem_init, sem_wait, and sem_post as executables you can call from a shell. I could call these from a before/after-job-subroutine - not too sure how they would react to aborts.
* I could get the Job Sequence to initiate the jobs via routines and the dsrun command using the Unix batch queue. Ugly, but it would work.
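To make the third and fourth options concrete, here is a minimal sketch of the kind of C wrapper described above, built on POSIX named semaphores (sem_open/sem_wait/sem_post rather than the unnamed sem_init variant, so separate job processes can share it). The function names, the semaphore name, and the pool size are all illustrative assumptions, not DataStage APIs; link with -pthread on older platforms.

```c
/* Sketch of acquire/release helpers shared by all jobs via one named
 * counting semaphore, e.g. "/ds_job_pool". Assumed names throughout. */
#include <fcntl.h>
#include <semaphore.h>

/* Open (creating on first use) a counting semaphore that starts with
 * `slots` permits; every job refers to it by the same name. */
static sem_t *open_pool(const char *name, unsigned slots) {
    return sem_open(name, O_CREAT, 0600, slots);
}

/* Called from a before-job subroutine: blocks until a slot is free. */
int pool_acquire(const char *name, unsigned slots) {
    sem_t *s = open_pool(name, slots);
    if (s == SEM_FAILED) return 1;
    int rc = sem_wait(s);          /* take one permit, waiting if none */
    sem_close(s);
    return rc == 0 ? 0 : 1;
}

/* Called from an after-job subroutine: returns the slot to the pool. */
int pool_release(const char *name, unsigned slots) {
    sem_t *s = open_pool(name, slots);
    if (s == SEM_FAILED) return 1;
    int rc = sem_post(s);          /* give the permit back */
    sem_close(s);
    return rc == 0 ? 0 : 1;
}
```

A thin main() dispatching on argv would turn these into the acquire/release executables mentioned above. The abort concern stands: a job that dies between acquire and release leaks a slot until something cleans it up.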
Any ideas for more elegant solutions?
Semaphores / batch queues
Ross Leishman
Run N parallel streams of job activities in the job sequence.
Get the run time of each job from its previous run, and balance the streams so that each stream's total run time is as close as possible to every other stream's.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Yup, very nice stuff that we use extensively. In the past his KBA Job Control Utilities have been freely available for download from his website, and I imagine they still are.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"I've seen some references to semaphores in this forum, but cannot find the right manual to describe them properly. It sounds like I would have to nominate which lock number each job was to use, rather than allocating a pool of semaphores which every job could use."

Next time you're at a TCL prompt, enter HELP LOCK and HELP CLEAR.LOCKS for the command versions, and HELP BASIC LOCK and HELP BASIC UNLOCK for the program variants.
Yes, you have to specify explicit semaphore (task synchronization lock) numbers. But you could write pooling routines fairly easily.
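To illustrate what such a pooling routine might look like: rather than hard-coding one lock number per job, each job scans a fixed pool of binary semaphores and claims the first free slot. This sketch uses POSIX named semaphores as a stand-in for UniVerse task locks; the prefix scheme, function names, and POOL_SIZE are all assumptions for illustration.

```c
/* Hypothetical pool allocator over named binary semaphores,
 * e.g. "/ds_lock.0" .. "/ds_lock.7". Assumed names throughout. */
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>

#define POOL_SIZE 8   /* UniVerse itself offers 64 task locks by default */

/* Returns the claimed slot number, or -1 if every slot is busy. */
int slot_acquire(const char *prefix) {
    char name[64];
    for (int i = 0; i < POOL_SIZE; i++) {
        snprintf(name, sizeof name, "%s.%d", prefix, i);
        sem_t *s = sem_open(name, O_CREAT, 0600, 1); /* one permit each */
        if (s == SEM_FAILED) continue;
        if (sem_trywait(s) == 0) {   /* non-blocking claim attempt */
            sem_close(s);
            return i;
        }
        sem_close(s);                /* busy; try the next slot */
    }
    return -1;
}

/* Frees the slot claimed earlier by slot_acquire(). */
int slot_release(const char *prefix, int slot) {
    char name[64];
    snprintf(name, sizeof name, "%s.%d", prefix, slot);
    sem_t *s = sem_open(name, O_CREAT, 0600, 1);
    if (s == SEM_FAILED) return 1;
    int rc = sem_post(s);
    sem_close(s);
    return rc == 0 ? 0 : 1;
}
```

The same scan-and-try shape carries over to numbered UniVerse task locks: loop over the lock numbers, attempt each one without blocking, and remember which one you got so the after-job subroutine can release it.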
If 64 semaphores are insufficient, change the PSEMNUM parameter in uvconfig, stop DataStage, run uvregen, and restart DataStage.
You could also use locks on non-existent records in UniVerse files; by this means you get what are called "shared locks". HELP BASIC RECORDLOCKL will get you under way, or consult the DataStage BASIC manual. You will also need to know about the RECORDLOCKED() function and the RELEASE statement if you go this route.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.