node setting

dsa · Post by **dsa** » Mon Oct 11, 2010 12:45 am

Hi,

Suppose I have a 6-node configuration. How can I specify, for each stage in a job, how many nodes it should execute on?

For example, suppose I have 3 stages in a job. How can I specify that stage one executes on 3 nodes, stage two on 4 nodes and stage three on 2 nodes?

Sreenivasulu · Post by **Sreenivasulu** » Mon Oct 11, 2010 1:06 am

You can apply different configuration files for different stages in the same job. But I am not sure whether you can do this for all stages (e.g. in a Transformer you can).

Regards
Sreeni

dsa · Post by **dsa** » Mon Oct 11, 2010 1:09 am

no no

Actually I have only one configuration file, which defines a six-node configuration. While designing the job I have to specify, for each stage, how many nodes it should run on.

ray.wurlod · Post by **ray.wurlod** » Mon Oct 11, 2010 1:13 am

To do that your configuration file will need to name node pools. Once node pools exist, your stages can each be restricted to run in a particular node pool.

dsa · Post by **dsa** » Mon Oct 11, 2010 2:23 am

ray.wurlod wrote:To do that your configuration file will need to name node pools. Once node pools exist, your stages can each be restricted to run in a particular node pool. ...

Do you mean we would have to define separate node pools of 3 nodes, 4 nodes and 2 nodes, and select the node pool in the job properties?

ArndW · Post by **ArndW** » Mon Oct 11, 2010 2:44 am

Sort of. You would declare node pools, then call the job with the configuration file and then limit each stage separately as to which pool it should use.
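As a rough illustration of declaring node pools, the sketch below renders a six-node configuration in which pools of 3, 4 and 2 nodes are named. The host name, resource paths and pool names (`pool3`, `pool4`, `pool2`) are hypothetical, and the layout follows the usual parallel configuration file format, so check it against a working file from your own installation. Each stage would then be constrained to one of these pools in its own properties.

```python
# Sketch only: host name, resource paths and pool names are made up.
POOLS = {
    "pool3": [1, 2, 3],     # stage one would be constrained to this pool
    "pool4": [3, 4, 5, 6],  # stage two
    "pool2": [1, 2],        # stage three
}


def render_config(node_count, pools, host="etlhost",
                  disk="/data/ds", scratch="/data/scratch"):
    """Return configuration file text with node_count nodes and named pools."""
    blocks = []
    for n in range(1, node_count + 1):
        # Every node stays in the default pool "" plus each named pool listing it.
        names = [""] + [p for p, members in pools.items() if n in members]
        pool_list = " ".join('"%s"' % name for name in names)
        blocks.append(
            '  node "node%d"\n'
            '  {\n'
            '    fastname "%s"\n'
            '    pools %s\n'
            '    resource disk "%s" {pools ""}\n'
            '    resource scratchdisk "%s" {pools ""}\n'
            '  }' % (n, host, pool_list, disk, scratch)
        )
    return "{\n" + "\n".join(blocks) + "\n}\n"


if __name__ == "__main__":
    print(render_config(6, POOLS))
```

Pools may overlap, as they do here, so the same six nodes can serve all three stages.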

dsa · Post by **dsa** » Mon Oct 11, 2010 3:04 am

ArndW wrote:Sort of. You would declare node pools, then call the job with the configuration file and then limit each stage separately as to which pool it should use. ...

Thanks!!!!!!!!!!!

But in that case, when I want this stage to execute on 3 nodes and I pick the node pool that has three nodes, it would execute only on the three nodes defined in that pool. If those nodes are not available, it would wait for them and would not pick up any available node outside the defined node pool, right?

Is there any way to let it dynamically pick up any of the available nodes?

ArndW · Post by **ArndW** » Mon Oct 11, 2010 3:06 am

If you are working in a distributed or grid environment and those nodes aren't available the job would fail. The only way for this to work dynamically is to have a program dynamically create a configuration file that is subsequently used by jobs.

dsa · Post by **dsa** » Mon Oct 11, 2010 3:15 am

ArndW wrote:If you are working in a distributed or grid environment and those nodes aren't available the job would fail. The only way for this to work dynamically is to have a program dynamically create a configu ...

Sorry, but how can we create it dynamically? Using shell scripting?

ArndW · Post by **ArndW** » Mon Oct 11, 2010 3:17 am

Yes, shell scripting would probably be the best way to do this.
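A minimal sketch of the idea, written in Python rather than shell purely for illustration. It assumes something else has already produced the list of usable host names (`available_hosts` is a hypothetical input, and working out which hosts are usable is not attempted here). It writes one node per host to a fresh file and returns the path, which a wrapper would export as `APT_CONFIG_FILE` before starting the job. The resource paths are placeholders.

```python
import os
import tempfile


def write_dynamic_config(available_hosts, disk="/data/ds", scratch="/data/scratch"):
    """Write a one-node-per-host configuration file and return its path."""
    if not available_hosts:
        raise ValueError("no hosts available; refusing to write an empty configuration")
    lines = ["{"]
    for i, host in enumerate(available_hosts, start=1):
        lines += [
            '  node "node%d"' % i,
            "  {",
            '    fastname "%s"' % host,
            '    pools ""',
            '    resource disk "%s" {pools ""}' % disk,
            '    resource scratchdisk "%s" {pools ""}' % scratch,
            "  }",
        ]
    lines.append("}")
    # mkstemp gives each run its own file, so concurrent jobs do not overwrite each other.
    fd, path = tempfile.mkstemp(suffix=".apt", text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return path


if __name__ == "__main__":
    config_path = write_dynamic_config(["hostA", "hostB", "hostC"])
    print("export APT_CONFIG_FILE=%s" % config_path)
```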

mhester · Post by **mhester** » Mon Oct 11, 2010 12:35 pm

ArndW wrote:Yes, shell scripting would probably be the best way to do this. ...

If, and only if, you wanted to create a configuration that had compute nodes that are available for use and nothing more elaborate. Anything beyond this simple task like

"is the compute node in use" or "does the compute node have resources to run my process" etc...

is outside of what can be addressed on this forum and outside of what you should attempt. If the process writes sequential data, how would you ensure that it starts on the correct nodes, where the export operator would be invoked? Not an easy task at all.

I am quite certain that a resource manager is what you would be looking for and that would imply a grid configuration.

dsa · Post by **dsa** » Mon Oct 11, 2010 10:51 pm

Thanks for the inputs !!!