Utilization of CPU

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

boppanakrishna
Participant
Posts: 106
Joined: Thu Jul 27, 2006 10:05 pm
Location: Mumbai

Utilization of CPU

Post by boppanakrishna »

hi all,
I want to specify a job to utilize, say, 50% of the CPU resource.

Where do I apply the changes? Should I set any environment variables?

Thanks in advance
boppana
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You mean limit a job to use no more than 50% of a CPU? No can do.
-craig

"You can never have too many knives" -- Logan Nine Fingers
boppanakrishna
Participant
Posts: 106
Joined: Thu Jul 27, 2006 10:05 pm
Location: Mumbai

Post by boppanakrishna »

chulett wrote:You mean limit a job to use no more than 50% of a CPU? No can do. ...
hi chulett,

I want to allocate 50% of my CPU to one job (job1) and not more than 30% to job2, within a sequencer.

Is it possible to apply this to the two jobs at the same time?

Thanks In advance
Boppana
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

As Craig noted, it's not possible at the level we work at. That's operating system programming, which I doubt you want to get into.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
jhmckeever
Premium Member
Posts: 301
Joined: Thu Jul 14, 2005 10:27 am
Location: Melbourne, Australia

Post by jhmckeever »

Allocating physical resources to processes is one of the main functions of an operating system. If you tell us WHAT you want to achieve (rather than HOW you were planning on doing it) we might be able to help? Where did the 50% and 30% figures come from?
<b>John McKeever</b>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Operating systems govern which processes run on which CPUs and what resources are allocated. The most you could do is run jobs under different userids and (if your hardware supports affinity) define CPU affinity for those userids. That will at least set aside CPUs for different users, but there's no way to enforce a ratio of CPU time for a process. You can "nice" or set priorities, but not throttle CPU utilization.
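A sketch of those two options on a Linux box; the job command, project, and CPU numbers are placeholders, not anything from this thread (taskset is Linux-specific, and other platforms use their own tools, e.g. bindprocessor on AIX or pbind on Solaris):

```shell
# Processor affinity: pin a hypothetical job invocation to CPUs 0 and 1.
# (Commented out because "myproject"/"myjob" are placeholders.)
#   taskset -c 0,1 dsjob -run myproject myjob

# Priority, not throttling: start a command at a lower scheduling priority.
nice -n 10 sh -c 'echo "running at nice 10"'
```

Note that neither approach caps utilization at a percentage: affinity restricts which CPUs a process may use, and nice only changes how the scheduler arbitrates when CPUs are contended.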
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
johnthomas
Participant
Posts: 56
Joined: Mon Oct 16, 2006 7:32 am

Post by johnthomas »

Correct me if I am wrong. What I know is that nodes are assigned to a processor, and we can assign the node where a job needs to run using node map constraints. But we may not be able to distribute the load as per the requirement (50%/30%).
JT
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Nodes are not assigned to a processor in any fashion, no way, no how. This is bad information.

You are creating a logical computer when you create the configuration file. You could configure 100 nodes on a single-CPU server if you were so inclined. You could configure 10 nodes on a 256-CPU server as well. Those 10 nodes would operate across all 256 CPUs, but at any given moment you would be using far fewer than that.
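As a sketch, a minimal two-node parallel configuration file (the file pointed to by APT_CONFIG_FILE) might look like the following; the hostname and disk paths are placeholders. Nothing here binds a node to a particular CPU:

```
{
	node "node1"
	{
		fastname "etlserver"
		pools ""
		resource disk "/data/ds/node1" {pools ""}
		resource scratchdisk "/scratch/ds/node1" {pools ""}
	}
	node "node2"
	{
		fastname "etlserver"
		pools ""
		resource disk "/data/ds/node2" {pools ""}
		resource scratchdisk "/scratch/ds/node2" {pools ""}
	}
}
```

Both logical nodes name the same physical host; the operating system decides which CPUs the resulting processes actually run on.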

The only way to make a process execute on a specified CPU is through processor affinity, which is a hardware/OS configuration and management matter.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There is no support in DataStage for "affinity" - tying particular processes to particular CPUs. As Ken notes, affinity is an operating system management thing, but you will not know the DataStage processes in advance, so it's almost impossible to do. And they all run the same executable! DataStage generates as many processes as it needs, and relies upon the operating system to balance them over the available CPUs. How fairly or otherwise this is done depends primarily upon the operating system.

You can balance the workload by ensuring that your chosen partitioning algorithm (in parallel jobs) distributes rows as evenly as possible over the available nodes, subject to any key-adjacency requirements for data combination, de-duplication and so on, or by your actual design in server jobs.

As noted, a processing node in a configuration file is a logical construct only - you can have more or fewer processing nodes than you have CPUs. To get the right number you need to monitor jobs and determine what percentage of the CPU each job, indeed each stage, requires on each node. This is achieved using the Monitor with instances displayed, or with DataStage API functions.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

At most you can prioritize the process, thereby restricting its share of CPU time. But again, it is based on the internal process IDs. Commands like nice or renice can be used for this. Again, I'm not sure how far it's possible to automate.
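A sketch of renice against a running process ID; the background sleep here is just a stand-in for a real DataStage process, whose PID you would have to look up yourself:

```shell
# Start a throwaway background process to stand in for a running job process.
sleep 5 &
pid=$!

# Lower its priority (a higher nice value means lower priority; only root
# can later move it back down).
renice -n 10 -p "$pid"

# Confirm the new nice value of the process.
ps -o ni= -p "$pid"

kill "$pid"
```

As the thread says, this only changes how the scheduler prioritizes the process under contention; it does not cap the process at any fixed percentage of CPU.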
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'