What is the meaning of a node in DataStage?

n.parameswara.reddy@accen · Mon Jun 29, 2009 6:41 am

What is the meaning of a node in DataStage? We thought that it was a processor. Please give us some explanation of this.

Thanks
Reddy

miwinter · Mon Jun 29, 2009 6:45 am

Hi,

You really need to read the documentation to understand this properly, but if you want a throwaway comment about what a "node" is, then in simple terms it is "somewhere to run a process or processes".

chulett · Mon Jun 29, 2009 6:48 am

Welcome.

A node is a logical rather than a physical concept, and while the number of CPUs plays a part here, a node is not equivalent to a CPU. Think of the node count as the number of "invocations" or "instances" or "copies" of a job to run, each one sharing the workload and transforming its share of the data. They could all be running on separate CPUs or they could all be running on the same one; that's not up to you but rather up to the underlying O/S.

rsunny · Mon Feb 07, 2011 2:14 pm

Hi Craig,

In general, when we talk about 2 readers per node and the number of nodes is 2, is the number of invocations or copies of the job 4, each one sharing the workload and transforming its share of the data?

Thanks in advance

ray.wurlod · Tue Feb 08, 2011 12:04 am

The number of nodes is determined absolutely by the number of node names in a syntactically valid parallel execution configuration file.
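To make that concrete, here is a rough Python sketch that counts the node names in a sample configuration. The sample text follows the usual layout of a parallel configuration file (`node "name" { fastname ... }`), but the host name, the paths, and the helper `node_names` are all made up for illustration, and the regular expression is a simplification rather than the engine's real parser.

```python
import re

# Hypothetical two-node configuration; host name and paths are placeholders.
SAMPLE_CONFIG = '''
{
  node "node1"
  {
    fastname "etlhost"
    pools ""
    resource disk "/data/ds/node1" {pools ""}
    resource scratchdisk "/scratch/ds/node1" {pools ""}
  }
  node "node2"
  {
    fastname "etlhost"
    pools ""
    resource disk "/data/ds/node2" {pools ""}
    resource scratchdisk "/scratch/ds/node2" {pools ""}
  }
}
'''

def node_names(config_text):
    """Return the node names declared in the configuration text."""
    # Matches the keyword `node` followed by a quoted name.
    return re.findall(r'\bnode\s+"([^"]+)"', config_text)

names = node_names(SAMPLE_CONFIG)
print(len(names), names)  # 2 ['node1', 'node2']
```

Note that both sample nodes use the same `fastname`, so this would be two logical nodes on one physical host.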

As noted, a node is a logical construct, associated with a set of resources available when execution occurs on that node.

So, if the job is run using a configuration file that specifies four nodes, then four "copies" of that job execute, each processing approximately one quarter of the records in the source data set. [Other factors can militate against achieving such even distribution, but these are beyond the scope of the current question.]
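The "one quarter each" arithmetic can be sketched in a few lines of Python. This is only an illustration that assumes records are dealt out to nodes in turn (round-robin style); the function name `round_robin_partition` is hypothetical, and other partitioning methods, such as key-based ones, may not split the data this evenly.

```python
from collections import Counter

def round_robin_partition(num_records, num_nodes):
    """Deal record indexes out to nodes in turn and count records per node."""
    counts = Counter()
    for i in range(num_records):
        counts[i % num_nodes] += 1
    return [counts[n] for n in range(num_nodes)]

print(round_robin_partition(1000, 4))  # [250, 250, 250, 250]
print(round_robin_partition(1002, 4))  # [251, 251, 250, 250]
```

With a record count that is not a multiple of the node count, some nodes simply receive one more record than the others.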

Questions about multiple readers per node really belong in a separate thread.