Node configuration


bunty
Participant
Posts: 2
Joined: Fri Jun 05, 2009 8:36 am

Node configuration

Post by bunty »

To benefit from parallel processing in DataStage, we need to configure multiple nodes in the configuration file. Are the configured nodes all separate physical machines, or are they just logical nodes?
If they are physical machines, do we need to install DataStage components on every node?
bunty
datskosaraju
Premium Member
Posts: 48
Joined: Tue Nov 25, 2008 11:10 pm
Location: Des Moines,IA

Re: Node configuration

Post by datskosaraju »

bunty wrote: Are the configured nodes all separate physical machines, or are they just logical nodes?
Can you please elaborate on what you mean by that?
"It's easier to go down a hill than up it but the view is much better at the top"
-Bennet,Arnold
bunty
Participant
Posts: 2
Joined: Fri Jun 05, 2009 8:36 am

Re: Node configuration

Post by bunty »

datskosaraju wrote:
bunty wrote: Are the configured nodes all separate physical machines, or are they just logical nodes?
Can you please elaborate on what you mean by that?
Suppose our configuration file defines four nodes, each with its own fastname and other settings. I have one parallel job that reads records from a flat file, uses an Aggregator stage to compute a sum, and writes the aggregates to another flat file.
The partitioning method is round robin. My question is: will the data be moved to the different nodes, and on which node will the aggregation take place? If the data is distributed across the nodes by round robin, which process on each node performs the aggregation (the sum)?
bunty
kiran259
Participant
Posts: 48
Joined: Thu Aug 16, 2007 11:17 pm
Location: United States

Re: Node configuration

Post by kiran259 »

Speaking generally about parallel processing in DataStage, it depends on the environment you use: SMP, cluster, or MPP. One physical machine can host multiple logical nodes, which you define accordingly in the DataStage configuration file. For the scenario described above, the different partitioning methods are already explained in the Parallel Job Developer's Guide.
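To make the logical-versus-physical distinction concrete, here is a minimal Python sketch. It is not DataStage configuration syntax, only a model of the idea that several logical nodes can share one fastname, as on a single SMP server. The node names, the host name `etl-server-1`, and the helper `logical_nodes_per_host` are all hypothetical.

```python
from collections import Counter

# Hypothetical model of a configuration file: logical node name -> fastname (host).
# On a single SMP server, every logical node can point at the same host.
CONFIG = {
    "node1": "etl-server-1",
    "node2": "etl-server-1",
    "node3": "etl-server-1",
    "node4": "etl-server-1",
}


def logical_nodes_per_host(config):
    """Count how many logical nodes are defined on each physical host."""
    return Counter(config.values())


if __name__ == "__main__":
    for host, count in logical_nodes_per_host(CONFIG).items():
        print(f"{host}: {count} logical node(s)")
```

In a cluster or MPP layout the same mapping would simply list more than one distinct host.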
Kiran Vaduguri

As soon as the fear approaches near, attack and destroy it.
mukesh.kummar
Participant
Posts: 2
Joined: Wed Mar 26, 2008 4:41 am
Location: Bangalore

Re: Node configuration

Post by mukesh.kummar »

Aggregation happens on every node, not on one particular node.
In the configuration file, node names are logical; the number of nodes does not necessarily correspond to the number of CPUs. There can be more than one node per CPU, for example when the tasks are not CPU-intensive.
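A rough Python sketch of what "aggregation on every node" means for the job described earlier. It only simulates partitions as in-memory lists and is not how the engine is implemented; the sample records and the function names `round_robin`, `hash_partition`, and `sum_by_key` are made up for illustration. It also shows one consequence worth keeping in mind: with round robin the same group key can land on several nodes, so each node produces only a partial sum for that key, whereas a key-based (hash) partitioning keeps each group on one node.

```python
import zlib
from collections import defaultdict

# Hypothetical input: (group key, amount) pairs read from a flat file.
RECORDS = [("A", 10), ("B", 20), ("A", 30), ("B", 40), ("A", 50), ("B", 60)]


def round_robin(records, num_nodes):
    """Deal records to partitions in turn, ignoring their content."""
    partitions = [[] for _ in range(num_nodes)]
    for i, record in enumerate(records):
        partitions[i % num_nodes].append(record)
    return partitions


def hash_partition(records, num_nodes):
    """Send every record with the same key to the same partition."""
    partitions = [[] for _ in range(num_nodes)]
    for key, value in records:
        index = zlib.crc32(key.encode("utf-8")) % num_nodes
        partitions[index].append((key, value))
    return partitions


def sum_by_key(partition):
    """The aggregation each node runs on its own partition only."""
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    for name, partitioner in (("round robin", round_robin), ("hash", hash_partition)):
        print(name)
        for node, part in enumerate(partitioner(RECORDS, 4)):
            print(f"  node{node + 1}: {sum_by_key(part)}")
```

With four simulated nodes, the round-robin run reports key "A" on two different nodes, each with a partial total, while the hash run reports each key exactly once.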
i dont believe in god but I am afraid of Him.