What is the meaning of a node in DataStage?

Posted: Mon Jun 29, 2009 6:41 am
by n.parameswara.reddy@accen
What is the meaning of a node in DataStage? We thought that it is a processor. Please give us some explanation of this.

Thanks
Reddy

Posted: Mon Jun 29, 2009 6:45 am
by miwinter
Hi,

You really need to read the documentation to understand this properly, but if you want a throwaway comment about what a "node" is, then in simple terms it is "somewhere to run a process or processes".

Posted: Mon Jun 29, 2009 6:48 am
by chulett
Welcome.

A node is a logical rather than a physical concept, and while the number of CPUs plays a part here, a node is not equivalent to a CPU. Think of it as the number of "invocations" or "instances" or "copies" of a job to run, each one sharing the workload and transforming its share of the data. They could all be running on separate CPUs or they could all be running on the same one; that's not up to you but rather up to the underlying O/S.

Posted: Mon Feb 07, 2011 2:14 pm
by rsunny
Hi Craig,

In general, when we talk about 2 readers per node and the number of nodes is 2, is the number of invocations or copies of a job then 4, each one sharing the workload and transforming its share of the data?

thanks in advance

Posted: Tue Feb 08, 2011 12:04 am
by ray.wurlod
The number of nodes is determined absolutely by the number of node names in a syntactically valid parallel execution configuration file.
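To make "the number of node names" concrete, here is a minimal Python sketch that counts the node declarations in a small configuration file. The sample text follows the usual layout of a parallel configuration file, but the node names, host name, and paths are made up for illustration, and the helper node_names is hypothetical. It does not check that the file is syntactically valid and it ignores comments; the engine does the real parsing.

```python
import re

# Hypothetical two-node configuration; host name and paths are invented.
SAMPLE_CONFIG = '''
{
    node "node1"
    {
        fastname "etlhost"
        pools ""
        resource disk "/data/ds/disk1" {pools ""}
        resource scratchdisk "/data/ds/scratch1" {pools ""}
    }
    node "node2"
    {
        fastname "etlhost"
        pools ""
        resource disk "/data/ds/disk2" {pools ""}
        resource scratchdisk "/data/ds/scratch2" {pools ""}
    }
}
'''

def node_names(config_text):
    """Return the quoted names that follow each 'node' keyword."""
    return re.findall(r'\bnode\s+"([^"]+)"', config_text)

names = node_names(SAMPLE_CONFIG)
print(len(names), names)
```

Both nodes here share the same fastname, which matches the point that several logical nodes can live on one machine.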

As noted, a node is a logical construct, associated with a set of resources available when execution occurs on that node.

So, if the job is run using a configuration file that specifies four nodes, then four "copies" of that job execute, each processing approximately one quarter of the records in the source data set. [Other factors can militate against achieving such even distribution, but these are beyond the scope of the current question.]
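As a rough illustration of "approximately one quarter each", the following Python sketch deals records out to four nodes in turn. Round-robin distribution is assumed here purely for the example; the partitioning method actually used depends on the job design, and the function round_robin_partition is hypothetical, not part of DataStage.

```python
def round_robin_partition(records, node_count):
    """Deal records out to node_count partitions, one at a time in turn."""
    partitions = [[] for _ in range(node_count)]
    for index, record in enumerate(records):
        partitions[index % node_count].append(record)
    return partitions

# Ten source records spread over a four-node configuration.
records = list(range(10))
parts = round_robin_partition(records, 4)
sizes = [len(p) for p in parts]
print(sizes)  # [3, 3, 2, 2]
```

With ten records the split cannot be exactly even, which is one simple reason the shares are only approximately one quarter.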

Questions about multiple readers per node really belong in a separate thread.