
Details of Node and Pool in the configuration File

Posted: Tue Mar 23, 2010 8:18 am
by ReachKumar
Hi,

Can someone explain the terms Node, Pool and the relationship between Nodes and Pools in the DataStage configuration file?

Posted: Tue Mar 23, 2010 11:08 am
by asorrell
A node is a parallel stream running a copy of your job. If your configuration file has 8 nodes configured, then your job will default to running 8 parallel job-streams every time it runs.

A pool is nothing more than a means of grouping resources for processing purposes, whether it be a disk pool or a node pool. For example, you could tag half of your nodes with the name "SmallPool". Then you could run your job constrained to the "SmallPool" nodes, which in this case would reduce the job to running only four parallel processes.

Note: node pools are not commonly used at most sites; more commonly I see different configuration files used to assign different numbers of nodes. Probably because lots of people find pools confusing... :-)
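To make that concrete, here is a sketch of a four-node configuration file with half the nodes tagged into "SmallPool" (the server name and paths are made up for illustration):

```
{
  node "node1"
  {
    fastname "etlserver"
    pools "" "SmallPool"
    resource disk "/data/ds/disk1" {pools ""}
    resource scratchdisk "/data/ds/scratch1" {pools ""}
  }
  node "node2"
  {
    fastname "etlserver"
    pools "" "SmallPool"
    resource disk "/data/ds/disk2" {pools ""}
    resource scratchdisk "/data/ds/scratch2" {pools ""}
  }
  node "node3"
  {
    fastname "etlserver"
    pools ""
    resource disk "/data/ds/disk3" {pools ""}
    resource scratchdisk "/data/ds/scratch3" {pools ""}
  }
  node "node4"
  {
    fastname "etlserver"
    pools ""
    resource disk "/data/ds/disk4" {pools ""}
    resource scratchdisk "/data/ds/scratch4" {pools ""}
  }
}
```

A stage (or the whole job) constrained to the "SmallPool" node pool would then run only on node1 and node2.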

Posted: Wed Mar 24, 2010 12:19 am
by ReachKumar
Thanks asorrell .

One more clarification:
If a pool is nothing but a grouping of resources, like a disk pool, then what are resource disk and scratch disk?

Is a disk pool the same as resource disk and scratch disk?
Please explain.

Posted: Wed Mar 24, 2010 12:46 am
by zulfi123786
Resource Disk: It's the location where your persistent data is stored, such as datasets and filesets.

Scratch Disk: It's the disk space DataStage uses to create temporary files as and when needed. For example, DataStage creates temporary files while sorting data, and these are cleared out after the sort has been performed.

We can group the resource and scratch disks into pools.
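For instance (paths invented for illustration), a single node entry can place a disk into a named disk pool in addition to the default pool:

```
node "node1"
{
  fastname "etlserver"
  pools ""
  resource disk "/data/ds/disk1" {pools "" "bigdata"}
  resource scratchdisk "/fastssd/ds/scratch1" {pools ""}
}
```

A Data Set written with the "bigdata" disk pool constraint would then land on /data/ds/disk1.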

Gurus correct me if I am wrong........

Posted: Wed Mar 24, 2010 1:05 am
by ray.wurlod
Node Pools can be subsets of the available nodes. The default node pool (which has the name "" in the configuration file) must include at least one node.

Disk Pools can be subsets or supersets of the available disk resources (resource disk or scratch disk).

One site where I worked used a 34-node configuration, of which 10 were assigned to processing and 24 were assigned (in a DB2 node pool) to the DB2 stages. This site processed huge volumes of data. At busier times they switched to a configuration that used 16 nodes for processing and 24 for DB2.
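A sketch of how such a split can be expressed (host names and paths here are hypothetical): processing nodes carry the default pool "", while DB2 nodes carry only the "db2" pool, so they are used only by stages constrained to that pool:

```
node "proc1"
{
  fastname "etlserver"
  pools ""
  resource disk "/data/ds/disk1" {pools ""}
  resource scratchdisk "/data/ds/scratch1" {pools ""}
}
node "db2node1"
{
  fastname "db2server1"
  pools "db2"
  resource disk "/db2data/ds/disk1" {pools ""}
  resource scratchdisk "/db2data/ds/scratch1" {pools ""}
}
```

Because "db2node1" is not in the default node pool, ordinary stages ignore it; only stages constrained to the "db2" node pool execute there.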