node setting

dsa
Participant
Posts: 37
Joined: Sun Oct 10, 2010 7:52 am

Post by dsa »

Hi,

Suppose I have a 6-node configuration. How can I specify how many nodes each stage in a job should execute on?

For example, suppose I have 3 stages in a job. How can I specify that stage one executes on 3 nodes, stage two on 4 nodes, and stage three on 2 nodes?
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

You can apply different configuration files for different stages in the same job. But I am not sure whether you can do this for all stages (e.g. in the Transformer you can).

Regards
Sreeni
dsa
Participant
Posts: 37
Joined: Sun Oct 10, 2010 7:52 am

Post by dsa »

no no

Actually, I have only one configuration file, which defines a six-node configuration. Now, while designing the job, I have to specify for each stage how many nodes it should run on.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

To do that your configuration file will need to name node pools. Once node pools exist, your stages can each be restricted to run in a particular node pool.
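
As a rough illustration of the idea (not taken from anyone's actual environment), a six-node configuration file with named node pools might look like the sketch below. The host name `etlhost`, the disk paths, and the pool names `pool3`, `pool4` and `pool2` are all made up for the example. Every node stays in the default pool (`""`), so unconstrained stages still run six ways; a stage restricted to `pool3` would run on three nodes, `pool4` on four, and `pool2` on two.

```
/* Hypothetical six-node configuration: host, paths and pool names are placeholders */
{
  node "node1"
  {
    fastname "etlhost"
    pools "" "pool3" "pool4"
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node2"
  {
    fastname "etlhost"
    pools "" "pool3" "pool4"
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node3"
  {
    fastname "etlhost"
    pools "" "pool3" "pool4"
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node4"
  {
    fastname "etlhost"
    pools "" "pool4"
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node5"
  {
    fastname "etlhost"
    pools "" "pool2"
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node6"
  {
    fastname "etlhost"
    pools "" "pool2"
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
}
```

The restriction itself is then set per stage, typically through the node pool constraint on the stage's Advanced tab.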
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dsa
Participant
Posts: 37
Joined: Sun Oct 10, 2010 7:52 am

Post by dsa »

ray.wurlod wrote:To do that your configuration file will need to name node pools. Once node pools exist, your stages can each be restricted to run in a particular node pool. ...
Do you mean that we would have to define different node pools containing 3 nodes, 4 nodes, and 2 nodes, and then select the node pool in the job properties?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Sort of. You would declare node pools, then call the job with the configuration file and then limit each stage separately as to which pool it should use.
dsa
Participant
Posts: 37
Joined: Sun Oct 10, 2010 7:52 am

Post by dsa »

ArndW wrote:Sort of. You would declare node pools, then call the job with the configuration file and then limit each stage separately as to which pool it should use. ...
Thanks!!!!!!!!!!!

But in that case, when I want a stage to execute on 3 nodes and I pick the node pool that has three nodes, it would execute only on the three nodes defined in that node pool. If these nodes are not available, it would wait for them but would not pick up any available node outside the defined node pool, right?

Is there any way to let it dynamically pick up any of the available nodes?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

If you are working in a distributed or grid environment and those nodes aren't available the job would fail. The only way for this to work dynamically is to have a program dynamically create a configuration file that is subsequently used by jobs.
dsa
Participant
Posts: 37
Joined: Sun Oct 10, 2010 7:52 am

Post by dsa »

ArndW wrote:If you are working in a distributed or grid environment and those nodes aren't available the job would fail. The only way for this to work dynamically is to have a program dynamically create a configu ...

Sorry, but how can we create it dynamically? Using shell scripting?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Yes, shell scripting would probably be the best way to do this.
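
A minimal sketch of what such a script could look like, assuming you already have a list of hosts you want to use. The function name `generate_config`, the host names, and the disk and scratch paths are hypothetical; the script does not check whether a host is actually free or has spare resources, it only writes one node entry per host it is given.

```shell
#!/bin/sh
# Sketch only: emit a parallel configuration file with one node per host.
# generate_config, the paths and the host names are illustrative
# placeholders, not values from a real environment.
generate_config() {
    disk_dir=$1        # directory used for "resource disk"
    scratch_dir=$2     # directory used for "resource scratchdisk"
    shift 2            # the remaining arguments are host names
    i=1
    echo "{"
    for host in "$@"; do
        echo "  node \"node${i}\""
        echo "  {"
        echo "    fastname \"${host}\""
        echo "    pools \"\""
        echo "    resource disk \"${disk_dir}\" {pools \"\"}"
        echo "    resource scratchdisk \"${scratch_dir}\" {pools \"\"}"
        echo "  }"
        i=$((i + 1))
    done
    echo "}"
}
```

For example, `generate_config /data/datasets /data/scratch hostA hostB hostC > /tmp/dynamic.apt` would write a three-node file, and the job would then be run with APT_CONFIG_FILE pointing at that file.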
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ

Post by mhester »

ArndW wrote:Yes, shell scripting would probably be the best way to do this. ...
If, and only if, you wanted to create a configuration from compute nodes that are available for use, and nothing more elaborate. Anything beyond this simple task, like "is the compute node in use" or "does the compute node have resources to run my process", is outside of what can be addressed on this forum and outside of what you should attempt. How would you ensure that a process starts on the correct nodes, where the export operator would be invoked from, if the process writes sequential data? Not an easy task at all.

I am quite certain that a resource manager is what you would be looking for and that would imply a grid configuration.
dsa
Participant
Posts: 37
Joined: Sun Oct 10, 2010 7:52 am

Post by dsa »

Thanks for the inputs !!!