Increase number of nodes in CONFIGURATION file

manjot.chehal
Participant
Posts: 1
Joined: Fri Jul 27, 2007 2:36 am

Increase number of nodes in CONFIGURATION file

Post by manjot.chehal »

Hi,

I am using a 4-node configuration file. I want to increase it to an 8-node configuration file, but I have the questions below:
What should I keep in mind before doing this?
Will it really increase the performance of my DataStage job?
Will it increase the number of DataStage jobs that I can run simultaneously?
How can I find out whether my server is an SMP or an MPP system?
How is the configuration file mapped internally to CPUs? I mean, how does it affect DataStage job performance?
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

Hi,

Interview any time soon?

The DataStage documentation will answer most, if not all, of these queries; please refer to it. Furthermore, it's such an open-ended question that you should really educate yourself on the subject as a whole.
Mark Winter
Nothing appeases a troubled mind more than good music
Dave Malia
Participant
Posts: 7
Joined: Thu May 28, 2009 5:10 am
Location: Plymouth

Re: Increase number of nodes in CONFIGURATION file

Post by Dave Malia »

manjot.chehal wrote:
Hi,

I am using a 4-node configuration file. I want to increase it to an 8-node configuration file, but I have the questions below:
What should I keep in mind before doing this?
Will it really increase the performance of my DataStage job?
Will it increase the number of DataStage jobs that I can run simultaneously?
How can I find out whether my server is an SMP or an MPP system?
How is the configuration file mapped internally to CPUs? I mean, how does it affect DataStage job performance?
1. The standard is to match nodes to CPUs (7 CPUs, 7 nodes).
2. Consider what the jobs are doing: small jobs are better run sequentially (1 node), and large jobs are better run on multiple nodes, because of the overhead of starting parallel processing.
3. Assuming resources are not fully consumed, then yes, parallelism generally improves performance (with point 2 having been considered).
4. Ask your server administrators... or find out what the terms mean.
5. Increasing nodes increases the number of processes per job, thus decreasing a job's duration. It does not allow more jobs to be run simultaneously; quite the opposite, I should think, as one job is using more resources.
6. I don't know what you mean by "mapped". It affects performance based on what the job is doing: just running 4 processes for one job won't help if they are all writing to the same heavily loaded file system...
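
To put a rough number on point 5, here is a back-of-the-envelope sketch in Python. It assumes the usual parallel engine process model of one conductor per job, one section leader per logical node, and one player per operator per node, with no operator combination and no sequential-only stages. The function name and the operator count are made up for illustration; real jobs will differ.

```python
def estimate_processes(nodes: int, parallel_operators: int) -> int:
    """Rough process count for one parallel job.

    Assumption: 1 conductor + 1 section leader per logical node
    + 1 player per operator per node (no operator combination,
    no sequential-only stages).
    """
    if nodes < 1 or parallel_operators < 0:
        raise ValueError("nodes must be >= 1 and parallel_operators >= 0")
    conductor = 1
    section_leaders = nodes
    players = nodes * parallel_operators
    return conductor + section_leaders + players


# Hypothetical job with 5 parallel operators.
for n in (1, 4, 8):
    print(n, "nodes ->", estimate_processes(n, parallel_operators=5), "processes")
```

Under those assumptions, going from 4 to 8 nodes roughly doubles the processes a single job starts, which is why more nodes tends to leave less room for other jobs rather than more.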

Good luck in the interview and don't quote me... :o
I'm lovin it!!
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

The answer to the first question is why I replied as I did initially... straight off the bat I don't use/know of a "standard" whereby you run as many nodes as you have CPUs :?
Mark Winter
Nothing appeases a troubled mind more than good music
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The wonderful thing about standards is that there are so many from which to choose. I disagree with some of the above, and any answer I gave to the original question would be prefaced with "It depends, but...".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

With 'node' being a logical concept there's really nothing tying it to the number of physical CPUs... and certainly no "one to one" standard. As noted, it depends on a number of factors.
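
To illustrate the "logical" part, here is a small Python sketch that writes out a configuration file with any number of nodes on the same host. The host name `etlhost` and both paths are hypothetical, and the layout follows the commonly documented node / fastname / pools / resource structure; compare it against the default configuration file on your own installation before relying on it.

```python
def build_config(n_nodes: int, fastname: str, disk: str, scratch: str) -> str:
    """Return configuration-file text with n_nodes logical nodes on one host."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    blocks = []
    for i in range(1, n_nodes + 1):
        # Every logical node points at the same physical host (fastname).
        blocks.append(
            f'  node "node{i}"\n'
            "  {\n"
            f'    fastname "{fastname}"\n'
            '    pools ""\n'
            f'    resource disk "{disk}" {{pools ""}}\n'
            f'    resource scratchdisk "{scratch}" {{pools ""}}\n'
            "  }"
        )
    return "{\n" + "\n".join(blocks) + "\n}\n"


# Hypothetical host name and paths; substitute values from your own installation.
text = build_config(8, "etlhost", "/data/datasets", "/data/scratch")
print(text)
```

Nothing in the generated file mentions CPUs: all eight nodes share one fastname, and the node count is simply whatever you ask for.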
-craig

"You can never have too many knives" -- Logan Nine Fingers
Dave Malia
Participant
Posts: 7
Joined: Thu May 28, 2009 5:10 am
Location: Plymouth

Post by Dave Malia »

Totally agree with you all on this subject, which is why most of my answers pretty much said it depends on... or words to that effect.

The only way to know what the system can handle and what's a good configuration for the jobs being run is basically to do some performance tests with different configurations and some solid analysis.
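
A minimal Python sketch of that kind of comparison follows. It is only a timing harness: `run_job` is a placeholder callable you would supply yourself, `fake_job` stands in for a real job run, and the file names `4node.apt` and `8node.apt` are hypothetical.

```python
import time
from typing import Callable, Dict, Iterable


def time_configs(run_job: Callable[[str], None],
                 config_paths: Iterable[str],
                 repeats: int = 3) -> Dict[str, float]:
    """Return the best wall-clock time in seconds for each configuration file."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    results = {}
    for path in config_paths:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_job(path)  # run the same job once with this configuration
            timings.append(time.perf_counter() - start)
        results[path] = min(timings)
    return results


def fake_job(config_path: str) -> None:
    """Stand-in for a real job run; replace with your own launcher."""
    time.sleep(0.01)


timings = time_configs(fake_job, ["4node.apt", "8node.apt"], repeats=2)
for path, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{path}: {seconds:.3f}s")
```

In a real test, `run_job` would launch the same job against the same data with each configuration file, for example by pointing the APT_CONFIG_FILE environment variable at it, ideally while the server carries its normal load.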
I'm lovin it!!