Hi,
Can anybody explain the following configuration file to me?
And what difference does it make when the word "DB2" is used?
I am a bit slow in understanding the inner logic of the file.
This configuration offers two nodes, only one of which is in the default node pool (the one with "" as its pool name). The other one is in a node pool called "DB2".
Non-DB2 stages will, unless specified otherwise, execute in the default node pool. In your configuration that means they will all run sequentially, since there's only one node in the pool.
DB2 stages will automatically seek out a node pool called "DB2" and execute in that. If there is no "DB2" node pool, they will also execute in the default node pool.
As far as I can see this configuration file is a misguided attempt to separate the DB2 processing from the other processing. The problem is that it has sacrificed all the benefits of parallelism to do so, without any gains in overall processing efficiency since all nodes are on the same machine.
If there were two or more processing (default) nodes, and maybe multiple nodes in the "DB2" node pool corresponding to the number of table partitions, then we might have a different story!
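For anyone reading without the original file, the layout described above (one node in the default pool, one node only in a "DB2" pool, both on the same machine) would look roughly like the sketch below. This is not the poster's actual file: the host name "etlhost" and the disk paths are made-up placeholders, and only the two "pools" lines matter for the discussion.

```
{
	node "node1"
	{
		fastname "etlhost"      /* hypothetical host name */
		pools ""                /* member of the default node pool */
		resource disk "/data/ds/datasets" {pools ""}
		resource scratchdisk "/data/ds/scratch" {pools ""}
	}
	node "node2"
	{
		fastname "etlhost"      /* same machine as node1 */
		pools "DB2"             /* only in the "DB2" pool, not in the default pool */
		resource disk "/data/ds/datasets" {pools ""}
		resource scratchdisk "/data/ds/scratch" {pools ""}
	}
}
```

Under the same assumptions, adding further node blocks with pools "" is what would give the non-DB2 stages more than one partition to run on.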
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks, Ray.
You are correct. The file actually has 12 default nodes and 12 "DB2" nodes; I pasted only part of it in order to understand the logic.
So, according to you, the 12 default nodes will be used by non-DB2 stages and the 12 "DB2" nodes by DB2 stages. Am I right?
In this scenario, am I achieving parallelism?
Thanks, Zulfi.
So in which scenario do you think a normal job will run faster: with 24 default nodes, or with 24 mixed nodes (12 default and 12 DB2)?
Consider a job
chandra.shekhar@tcs.com wrote:
So in which scenario do you think a normal job will run faster: with 24 default nodes, or with 24 mixed nodes (12 default and 12 DB2)?
What do you say?
There is a lot to say in answer to that.
Adding more and more nodes will not by itself make your job run faster. You need to understand your hardware to decide how far you can push parallelism.
Adding too many nodes increases the overhead of managing the many processes.
If you are not sure what lies under the hood, use trial-and-error runs to find the number of nodes that gives the best performance (which in your case means processing speed). Be aware that the best node count also depends on the load on the server at the time the test is run.
See if you can talk to a site that is using a comparably sized configuration, for example Target Corporation (they have offices in Minneapolis and Bangalore). One of their configurations has 10 processing nodes and 24 DB2 nodes (12 for reading, 12 for writing).