Spawning of processes in a DSEE environment

sathyanveshi · Post by **sathyanveshi** » Thu Jun 09, 2005 6:07 am

Hi,

Suppose I have 8 CPUs and I configure 4 nodes on them. Can I assume that 1 node is equivalent to 2 CPUs?

Also, is a node a Unix process? Is fork() what gets invoked when the CPUs are converted into nodes?

Cheers,
Mohan

ArndW · Post by **ArndW** » Thu Jun 09, 2005 6:29 am

Sathyanveshi,

The only thing that controls which CPUs processes execute on is the OS; DataStage has no control over where a specific Orchestrate pid ends up. So in your case, with a 4-node configuration file and 8 CPUs, you cannot assume that a node will map to 2 CPUs. Nonetheless, UNIX will spread the processing load across all the available CPUs, so in effect you can pretend that this is happening.
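To see for yourself that CPU placement is an OS matter, you can ask the OS which CPUs it is willing to run a given process on. This is only a minimal sketch in Python, not anything DataStage provides: `allowed_cpus` is a made-up helper name, and `os.sched_getaffinity` exists on Linux only, so the sketch falls back to a plain CPU count elsewhere.

```python
import os


def allowed_cpus():
    """Return the set of CPU indices the OS may schedule this process on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: the affinity mask of the calling process (pid 0 means "self").
        return set(os.sched_getaffinity(0))
    # Other platforms: assume every CPU the OS reports is usable.
    return set(range(os.cpu_count() or 1))


if __name__ == "__main__":
    cpus = allowed_cpus()
    print(f"The OS may run this process on {len(cpus)} CPU(s): {sorted(cpus)}")
```

Unless someone has pinned the process, the result will normally cover every CPU on the box, regardless of how many nodes a configuration file declares.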

The number of nodes you declare in your configuration file tells DataStage how many distinct parallel streams it needs to split the job into. Then, depending on the number and type of stages in the job, these concurrent processing streams are further broken down into separate processes (pids visible in the output of the ps command). From this point on, the operating system takes over and moves processes around. When a given "node" process gets swapped out, it will not necessarily execute on the same physical CPU when it gets brought back into memory.

The underlying mechanism that UNIX uses to spawn a new process is the fork() call.
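For illustration, here is roughly what spawning processes with fork() looks like, using Python's thin wrapper around the system call. This is a sketch of the general UNIX mechanism only, not of how DataStage starts its processes internally; `spawn_workers` is a hypothetical name and the count of 4 simply mirrors the 4-node example in the question.

```python
import os


def spawn_workers(count):
    """Fork `count` child processes; return a list of (child_pid, exit_code)."""
    pids = []
    for index in range(count):
        pid = os.fork()
        if pid == 0:
            # Child process: real work would go here.
            # os._exit ends the child immediately without running the
            # parent's cleanup handlers or flushing its buffers twice.
            os._exit(index)
        # Parent process: fork() returned the new child's pid.
        pids.append(pid)

    results = []
    for pid in pids:
        # Wait for each child and decode its exit status.
        _, status = os.waitpid(pid, 0)
        results.append((pid, os.WEXITSTATUS(status)))
    return results


if __name__ == "__main__":
    for pid, code in spawn_workers(4):
        print(f"child pid {pid} exited with code {code}")
```

Each child gets its own pid, which is what you would see in ps. Nothing in the code says which CPU a child runs on; that is left entirely to the OS scheduler.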