IPC & Managing resource contention

Posted: Mon Apr 30, 2007 7:25 am
by pajj
We have an 8 node, dual core server. Assuming:
a) no other jobs are running
c) we have a simple passive-active-passive stage structure in our test job
d) IPC is enabled

How many processes will be started for the test job?

Assuming the first test job is running, if a second test job (test2) with a similar structure is submitted, what is the impact on test2 with regard to server resources?

tks

Posted: Mon Apr 30, 2007 7:44 am
by DSguru2B
Each active stage will create a process. With multiple jobs running, the resources will be shared by the jobs and the OS kernel will decide how much to provide to each process.

Posted: Mon Apr 30, 2007 7:58 am
by pajj
So with 8 dual node cpus, there are 16 processes available?

Will the server dynamically adjust the number of processes when the second job is submitted or will test1 keep all the processes until it is finished?

Posted: Mon Apr 30, 2007 8:04 am
by DSguru2B
It depends upon how your kernel is set up. Ideally, it should invoke the second processor. To get a better understanding of this, I suggest you sit with your SA and ask him/her how processes are managed at the OS level.

Posted: Mon Apr 30, 2007 3:25 pm
by ray.wurlod
pajj wrote: So with 8 dual node cpus, there are 16 processes available?

Will the server dynamically adjust the number of processes when the second job is submitted or will test1 keep all the processes until it is finished?
You're operating on an incorrect assumption. The number of processes available is not governed by the number of CPUs - as proof just look at the Processes tab of Task Manager on your PC - there are probably a couple of hundred processes even though there's only one CPU.

One of the functions of the operating system is to allocate each process that needs CPU a share of the available time. With most applications (including DataStage) you do not control which CPU gets which task - through its life a particular process may execute on any subset of the available CPUs - though only on one at a time unless it has multiple threads.
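The point that process count is not tied to CPU count can be checked with a small experiment. The following is only a minimal sketch, not anything DataStage-specific: the helper name `spawn_workers` and the factor of four are assumptions chosen for illustration. It launches several times more child processes than the machine has CPUs and leaves it to the operating system to schedule them.

```python
import os
import subprocess
import sys


def spawn_workers(count):
    """Launch `count` child processes without waiting in between; return their PIDs."""
    # Each child only sleeps briefly. The OS decides which CPU runs which child.
    children = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.2)"])
        for _ in range(count)
    ]
    pids = {child.pid for child in children}
    # Reap every child so no stray processes are left behind.
    for child in children:
        child.wait()
    return pids


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    wanted = cpus * 4  # deliberately more processes than CPUs (illustrative factor)
    pids = spawn_workers(wanted)
    print(f"{cpus} CPUs reported, {len(pids)} processes launched")
```

Every launch succeeds regardless of the CPU count; the limit that matters in practice is total demand for CPU, memory and I/O.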

The whole thing is automatic and you don't need to worry about a thing until the point when there is too much demand for CPU (and other resources), at which point you back off a little on the demand side.

Operating system tuning is a complex art, and one that takes considerable time to learn to do well. It is definitely not something that can be taught through a DataStage forum.

Posted: Mon Nov 12, 2007 2:40 pm
by nkln@you
In our project, we are trying to include an IPC stage in each job. Is it a good approach? Assuming I have 40 jobs, will there be any resource contention if all 40 jobs are running in parallel? And if the volume of data in a job is just, say, 10,000 rows, should we include IPC, or is a job without transformers fine?

Posted: Mon Nov 12, 2007 3:00 pm
by chulett
Erk. Not an approach I would take. We have a couple thousand jobs and I can count on one mutated hand how many have IPC stages in them.

Posted: Mon Nov 12, 2007 3:13 pm
by nkln@you
So I can safely assume that putting an IPC stage in each job will definitely kill the CPU, in the sense that the number of processes created will be too high?
Regarding the job with 10,000 rows, is it safe to go for IPC, or will a direct link from the source to the target OCI stage do?

Posted: Tue Nov 13, 2007 11:48 am
by nkln@you
Can anyone please confirm the above statement that "using IPC in every job would be very bad, as the number of processes created will be very high and that can bring down DS performance as a whole"?

Posted: Tue Nov 13, 2007 4:01 pm
by ray.wurlod
Putting an IPC stage in every job will indeed create extra processes - that is its function. As for "kill the CPU" that depends on total load, but other posts from you and from your site suggest that you are doing that already.
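As a rough back-of-the-envelope sketch of the "extra processes" point: the 40 jobs come from the question above, but the per-job figures (one process per job without IPC, one additional process per IPC stage) are assumptions for illustration only, not measured DataStage behaviour, and `total_processes` is a hypothetical helper.

```python
def total_processes(jobs, base_per_job, ipc_stages_per_job, extra_per_ipc=1):
    """Estimate how many processes exist when `jobs` jobs run at the same time."""
    return jobs * (base_per_job + ipc_stages_per_job * extra_per_ipc)


if __name__ == "__main__":
    jobs = 40  # figure taken from the question in this thread
    # Assumed for illustration: one process per job, plus one extra per IPC stage.
    without_ipc = total_processes(jobs, base_per_job=1, ipc_stages_per_job=0)
    with_ipc = total_processes(jobs, base_per_job=1, ipc_stages_per_job=1)
    print(f"without IPC: {without_ipc} processes, with IPC: {with_ipc} processes")
```

Under those assumed figures the process count doubles. Whether that matters depends, as noted, on the total load already on the server.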

In short - everything in moderation. I suspect someone went on a class or read in a manual that "IPC stage will improve performance" and made a decision to include them everywhere. Garfield once famously asserted "diet kitty munchies? 20 packets of them and I should be thin as a rake!".

Posted: Tue Nov 13, 2007 9:12 pm
by nkln@you
ray.wurlod wrote: but other posts from you and from your site suggest that you are doing that already.
I am not doing that currently, but it's an idea that is circulating in my project and I am totally against it. So I wanted a few points to prove that using IPC in every job is not a good approach. Can you point me to any manual so that I will have concrete proof?

Posted: Tue Nov 13, 2007 10:15 pm
by chulett
There's no manual that will say "don't add one of these to every job". The Inter-Process Stages are documented in the Server Job Developer's Guide in Chapter 12, but all that's going to tell you is how much of a wonderful job it will do speeding all your jobs up.

The only way you'll get "concrete proof" is to make the change and observe the results.
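One way to "observe the results" is to time both variants of a job over several runs and compare. The sketch below is a generic timing harness, not a DataStage tool: `time_command` is a hypothetical helper, and the two commands are placeholders that would need to be replaced with whatever launches each job variant and waits for it to finish.

```python
import statistics
import subprocess
import sys
import time


def time_command(command, runs=3):
    """Run `command` `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so failed runs are not timed silently.
        subprocess.run(command, check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder commands (assumptions): swap in the real launcher for each job variant.
    without_ipc = [sys.executable, "-c", "pass"]
    with_ipc = [sys.executable, "-c", "pass"]
    for label, command in (("without IPC", without_ipc), ("with IPC", with_ipc)):
        timings = time_command(command)
        print(f"{label}: median {statistics.median(timings):.3f}s over {len(timings)} runs")
```

Running the comparison while the usual concurrent workload is active gives a more honest picture than timing a single job on an idle server.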