Node map constraint
Moderators: chulett, rschirm, roy
What is the purpose of the node map constraint? When I open the browse option, I have 4 nodes listed. When do I choose a single node vs. multiple nodes? Please help me understand this.
-
- Premium Member
- Posts: 138
- Joined: Wed Jul 16, 2008 9:51 pm
- Location: Kolkata
Hi,
Hi,
The nodes listed under Node Map Constraint come from the configuration file pointed to by APT_CONFIG_FILE.
The constraint restricts a stage to the nodes you select: if you tick only "node3", for example, that stage will run on that one node. If you don't specify any node map constraint, the stage is free to run on all the nodes defined in the configuration file.
More nodes, combined with an appropriate partitioning method, can improve job performance by reducing runtime.
So when your job handles a large number of records and/or contains complex transforms, it is generally worth letting it run across more nodes.
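For context, here is a minimal sketch of the kind of parallel configuration file APT_CONFIG_FILE points to. The hostname and resource paths below are made-up examples; a real file would use your server's names and disk locations:

```
{
    node "node1"
    {
        fastname "etlserver"
        pools ""
        resource disk "/ds/data1" {pools ""}
        resource scratchdisk "/ds/scratch1" {pools ""}
    }
    node "node2"
    {
        fastname "etlserver"
        pools ""
        resource disk "/ds/data2" {pools ""}
        resource scratchdisk "/ds/scratch2" {pools ""}
    }
}
```

The node names defined here ("node1", "node2", ...) are what appear in the Node Map Constraint browse list.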
TONY
ETL Manager
Infotrellis India
"Do what you can, with what you have, from where you are and to the best of your abilities."
I have to add a note of caution here. More may not always be better. I have seen numerous instances where developers have overconfigured the number of nodes and saturated the system, resulting in throughput problems.
I've also seen cases where too many nodes have caused problems by impacting other important processes that are co-resident on the system.
If there are no resource constraints and the data can be partitioned correctly to take advantage of the new nodes, then you will see an increase in performance when you add nodes.
If you have the ability to do so, I'd recommend running tests on high-impact jobs with different configurations - you'll usually see a "sweet spot" where throughput is maxed out.
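As a sketch of that kind of test (the install path, project name, and job name below are hypothetical), you could run the same job under different configuration files by overriding the $APT_CONFIG_FILE job parameter with dsjob and comparing the elapsed times:

```shell
# Run the same job under 1-, 2-, and 4-node configuration files
# and compare runtimes to find the throughput "sweet spot".
# Paths and names are illustrative, not real values.
for cfg in 1node 2node 4node; do
    dsjob -run -mode NORMAL \
          -param "\$APT_CONFIG_FILE=/opt/IBM/InformationServer/Server/Configurations/${cfg}.apt" \
          -wait MyProject MyHighImpactJob
done
```

Note that this only works if the job exposes $APT_CONFIG_FILE as a job parameter, which is a common practice for exactly this reason.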