
Increase number of nodes in CONFIGURATION file

Posted: Wed May 27, 2009 7:55 am
by manjot.chehal
Hi,


I am using a 4-node CONFIGURATION file. I want to increase it to an 8-node CONFIGURATION file, but I have the questions below:
What should I keep in mind before doing this?
Will it really increase the performance of my DataStage job?
Will it increase the number of DataStage jobs that I can run simultaneously?
How can I find out whether my server is an SMP or an MPP system?
How is the configuration file mapped internally to CPUs? I mean, how does it affect the performance of DataStage jobs?

Posted: Wed May 27, 2009 7:58 am
by miwinter
Hi,

Interview any time soon?

The DataStage documentation will answer most if not all of these queries, so please refer to it. Furthermore, it's such an open-ended question that you should really educate yourself on the subject as a whole.

Re: Increase number of nodes in CONFIGURATION file

Posted: Thu May 28, 2009 8:08 am
by Dave Malia
manjot.chehal wrote: Hi,


I am using a 4-node CONFIGURATION file. I want to increase it to an 8-node CONFIGURATION file, but I have the questions below:
What should I keep in mind before doing this?
Will it really increase the performance of my DataStage job?
Will it increase the number of DataStage jobs that I can run simultaneously?
How can I find out whether my server is an SMP or an MPP system?
How is the configuration file mapped internally to CPUs? I mean, how does it affect the performance of DataStage jobs?
1. The standard is to match nodes to CPUs (7 CPUs, 7 nodes).
2. Consider what the jobs are doing: small jobs are better run sequentially (1 node), while large jobs are better run on multiple nodes, because of the overhead of starting parallel processing.
3. Assuming resources are not fully consumed, then yes, parallelism generally improves performance (with point 2 having been considered).
4. Ask your server administrators... or find out what the terms mean.
5. Increasing nodes increases the number of processes per job, thus decreasing a job's duration. It does not allow more jobs to be run simultaneously; quite the opposite, I should think, as one job is using more resources.
6. I don't know what you mean by "mapped". It affects performance depending on what the job is doing: just running 4 processes for one job won't help if they are all writing to the same heavily loaded file system...
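
To make the trade-off in points 2 and 3 concrete, here is a toy cost model in Python. It is only a sketch: the formula (a fixed startup cost per node plus the work divided evenly across nodes) and every number in it are assumptions for illustration, not measurements of DataStage.

```python
def estimated_runtime(work_seconds, nodes, startup_per_node):
    """Toy estimate: per-node startup overhead plus evenly divided work.

    Assumption: the work splits perfectly across nodes, which real jobs
    rarely do (skewed data, shared disks, repartitioning and so on).
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return startup_per_node * nodes + work_seconds / nodes


def best_node_count(work_seconds, startup_per_node, candidates=(1, 2, 4, 8)):
    """Return the candidate node count with the lowest estimated runtime."""
    return min(
        candidates,
        key=lambda n: estimated_runtime(work_seconds, n, startup_per_node),
    )


# Hypothetical figures: 1.5 s of startup overhead per node.
for work in (2, 600):
    nodes = best_node_count(work, startup_per_node=1.5)
    print(f"{work} s of work -> {nodes} node(s)")
```

With these made-up figures the 2-second job comes out best on 1 node and the 600-second job on 8 nodes, which is the pattern described in point 2. Real jobs will not follow the formula exactly.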

Good luck in the interview and don't quote me... :o

Posted: Thu May 28, 2009 8:15 am
by miwinter
The answer to the first question is why I replied as I did initially... straight off the bat I don't use/know of a "standard" whereby you run as many nodes as you have CPUs :?

Posted: Thu May 28, 2009 4:38 pm
by ray.wurlod
The wonderful thing about standards is that there are so many from which to choose. I disagree with some of the above, and any answer I gave to the original question would be prefaced with "It depends, but...".

Posted: Thu May 28, 2009 5:24 pm
by chulett
With 'node' being a logical concept there's really nothing tying it to the number of physical CPUs... and certainly no "one to one" standard. As noted, it depends on a number of factors.
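
As a rough illustration of nodes being logical, the Python sketch below builds the text of a configuration file in which every node points at the same host. The host name and directory paths are hypothetical placeholders, and the layout follows the usual node / fastname / pools / resource form; check it against your own existing configuration file before relying on it.

```python
def build_config(node_count, fastname, disk_dir, scratch_dir):
    """Return configuration file text with node_count logical nodes.

    All nodes share one fastname (host), so the node count is chosen
    freely and is not derived from the number of CPUs.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    lines = ["{"]
    for i in range(1, node_count + 1):
        lines += [
            f'  node "node{i}"',
            "  {",
            f'    fastname "{fastname}"',
            '    pools ""',
            f'    resource disk "{disk_dir}" {{pools ""}}',
            f'    resource scratchdisk "{scratch_dir}" {{pools ""}}',
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical host and paths, for illustration only.
print(build_config(2, "etl_host", "/data/ds/disk", "/data/ds/scratch"))
```

Calling `build_config(8, ...)` with the same host gives an 8-node file for the same machine; nothing in the file refers to a CPU count.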

Posted: Fri May 29, 2009 2:56 am
by Dave Malia
Totally agree with you all on this subject, which is why most of my answers pretty much said "it depends on..." or words to that effect.

The only way to know what the system can handle, and what's a good configuration for the jobs being run, is basically to do some performance tests with different configurations and some solid analysis.
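
A minimal Python sketch of such a test harness follows. It assumes you supply your own `run_job` function that launches the job with a given configuration file and waits for it to finish; the `fake_run_job` stand-in and the file paths here are hypothetical and only make the example runnable.

```python
import statistics
import time


def time_configs(config_paths, run_job, repeats=3):
    """Run the job `repeats` times per configuration file.

    Returns a dict mapping each path to its median wall-clock time in
    seconds. The median dampens the effect of one unusually slow run.
    """
    results = {}
    for path in config_paths:
        durations = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_job(path)
            durations.append(time.perf_counter() - start)
        results[path] = statistics.median(durations)
    return results


def fake_run_job(config_path):
    """Stand-in for a real job launcher; it only sleeps briefly."""
    time.sleep(0.01)


# Hypothetical configuration file paths.
timings = time_configs(["/tmp/4node.apt", "/tmp/8node.apt"], fake_run_job)
for path, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{path}: {seconds:.3f} s")
```

Timings alone are not the "solid analysis" part: it is worth running the tests while other typical jobs are active and watching CPU, memory, and disk load at the same time.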