Load Balancing Using Job Control

Post by **kduke** » Tue Oct 02, 2007 4:54 pm

I am a little uncomfortable with what we call load balancing. Ken's code is nice, but it is a throttle that limits the number of jobs running at one time to x. So if you set x = 6, then only 6 jobs can run at a time. Nice, but real load balancing needs a weighted average. I think we as a group can invent something more sophisticated.

The ideal is using 100% of the CPU at all times with no paging. Maybe that is 6 jobs, maybe 10 sometimes. It depends on the jobs, right? How do we get the real % of CPU for a given job? How do we optimize our resources? EtlStats can give you row counts and how long a job runs. It also stores this in hashed files, which could easily be read to drive real load balancing. What would this data structure look like? Let's assume we store % of CPU by job. We need to combine this with run time to stack the jobs into run queues.

Code:

JobA = 20% runs 20 min.
JobB = 50% runs 120 min.
JobC = 25% runs 90 min.
JobD = 25% runs 90 min.

So after 20 min., when JobA finishes, we need to start JobE, and it should use less than 25% of the CPU. The same applies after 90 min., when JobC and JobD finish.
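To make the stacking idea concrete, here is a minimal sketch in Python rather than DataStage BASIC. It assumes each job has a known average CPU % and run time (for example, pulled from the EtlStats hashed files) and simply starts a job as soon as its share fits under a CPU budget. The `Job` class, the `schedule` function, the 100% `capacity` default, and the JobE figures (20%, 30 min.) are all made up for illustration. Note that the four sample figures add up to 120%, so with a 100% budget this sketch holds JobD back until JobA finishes.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    cpu_pct: int   # assumed average CPU share while the job runs
    minutes: int   # assumed run time


def schedule(jobs, capacity=100):
    """Greedy stacking: start each job as soon as its CPU share fits.

    Returns a list of (start_minute, job_name) in start order.
    """
    pending = list(jobs)   # not started yet, in submission order
    running = []           # min-heap of (finish_minute, seq, job)
    used = 0               # CPU percent currently committed
    now = 0
    seq = 0                # tie-breaker so the heap never compares Job objects
    starts = []
    while pending or running:
        # Start every pending job that fits in the remaining headroom.
        still_waiting = []
        for job in pending:
            if used + job.cpu_pct <= capacity:
                heapq.heappush(running, (now + job.minutes, seq, job))
                seq += 1
                used += job.cpu_pct
                starts.append((now, job.name))
            else:
                still_waiting.append(job)
        pending = still_waiting
        if not running:
            # Nothing is running, yet nothing fits: a job is too big on its own.
            raise ValueError(f"too large for capacity: {[j.name for j in pending]}")
        # Advance the clock to the next completion and free that job's share.
        now, _, done = heapq.heappop(running)
        used -= done.cpu_pct
    return starts


jobs = [
    Job("JobA", 20, 20),
    Job("JobB", 50, 120),
    Job("JobC", 25, 90),
    Job("JobD", 25, 90),
    Job("JobE", 20, 30),   # hypothetical figures, not from the post
]
print(schedule(jobs))
```

This is first-fit in submission order, nothing smarter. A real version would probably also want to watch memory and paging, not only CPU.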

Let's build this. I think we need dependencies as well: JobE needs to run after JobA, that type of stuff.
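One possible way to fold dependencies in, again as a rough Python sketch: before starting anything, check both that a job's prerequisites have finished and that its CPU share fits in what is left. The `ready_jobs` function, the `depends_on` mapping, and the JobE/JobF figures are hypothetical names and numbers chosen for the example.

```python
def ready_jobs(pending, finished, depends_on, cpu_pct, headroom):
    """Return the pending jobs that can start now.

    A job can start when all of its prerequisites are in `finished`
    and its CPU share fits in the remaining `headroom` (in percent).
    """
    ready = []
    for name in pending:
        prereqs = set(depends_on.get(name, ()))
        if prereqs <= set(finished) and cpu_pct[name] <= headroom:
            ready.append(name)
            headroom -= cpu_pct[name]   # reserve this job's share
    return ready


cpu_pct = {"JobE": 20, "JobF": 30}      # assumed CPU shares
depends_on = {"JobE": {"JobA"}}         # JobE must run after JobA

# JobA still running: JobE is blocked, and JobF does not fit in 25%.
print(ready_jobs(["JobE", "JobF"], set(), depends_on, cpu_pct, 25))
# JobA finished: JobE can start.
print(ready_jobs(["JobE", "JobF"], {"JobA"}, depends_on, cpu_pct, 25))
```

A check like this would be called every time a job finishes, with the headroom recomputed from whatever is still running.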