Conductor Node - Section Leader

prasad111
Premium Member
Posts: 173
Joined: Fri May 19, 2006 10:53 am

Post by prasad111 »

Hi,

I have searched the forum and am using this thread as my starting point:
viewtopic.php?t=121634&highlight=CONDUCTOR_HOSTNAME

I want to use server1 as the conductor node and server2 to run all the processes.

Code:

{
  node "node1"
  {
    fastname "server2"
    pools "conductor"
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
  node "node2"
  {
    fastname "server2"
    pools ""
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
}
I am using the above configuration and have set APT_PM_CONDUCTOR_HOSTNAME=server1 in the job.
When I execute the job on server1, I get the following error:

Code:

main_program: Fatal Error: An ORCHESTRATE program must be started on a node
in the configuration file.  This program is running on server1
which is not in the configuration file: /datastage/config/test_sectleader.apt
Could anyone please let me know what I am doing wrong here?

Thanks
Last edited by prasad111 on Thu Feb 18, 2010 10:41 am, edited 1 time in total.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You're failing to mention server1 in your configuration file. I believe (from your description) that you intended to name server1 as the fastname for node1 but named server2 instead.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
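
As an illustration of the error above, here is a minimal Python sketch that collects the fastname entries from the first configuration file in this thread and tests whether a given host is among them. It is not the engine's actual check, which may resolve host names differently, and the function names `fastnames` and `host_is_listed` are made up for this example.

```python
import re

# Configuration text copied from the first post in this thread.
CONFIG_TEXT = '''
{
  node "node1"
  {
    fastname "server2"
    pools "conductor"
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
  node "node2"
  {
    fastname "server2"
    pools ""
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
}
'''


def fastnames(config_text):
    """Return the set of host names that appear in fastname entries."""
    return set(re.findall(r'fastname\s+"([^"]+)"', config_text))


def host_is_listed(host, config_text):
    """Return True if the host appears as a fastname in the configuration."""
    return host in fastnames(config_text)


# Hypothetical usage: the job is started on server1.
print(host_is_listed("server1", CONFIG_TEXT))  # False
```

Since server1 is absent from every fastname entry, the check fails for the machine the job is started on, which is consistent with the fatal error quoted earlier.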
prasad111
Premium Member
Posts: 173
Joined: Fri May 19, 2006 10:53 am

Post by prasad111 »

Thanks for the reply,

If I set server1 as the fastname, the job runs fine, but then I don't need the APT_PM_CONDUCTOR_HOSTNAME parameter. So when do we use this parameter?

With the fastname set to server1, the PXEngine/bin/osh processes run on both server1 and server2. Is it possible to run all the processes on server2 and use server1 only for the startup process (only the phantom)? In other words, is there a way to make more use of server2's resources?

Currently we are using the configuration file:

Code:

{
  node "node1"
  {
    fastname "server1"
    pools ""
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
  node "node2"
  {
    fastname "server2"
    pools ""
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
}
This configuration uses 90% of the CPU on server1 and only 10% on server2, and we are trying to balance the load. To elaborate on the job details: there are 15 sequencer jobs that run throughout the day (every 5 or 15 minutes), and the parallel jobs called from them run for ~30 seconds to ~2 minutes. Some jobs create 5 osh processes and some create 35.
Last edited by prasad111 on Thu Feb 18, 2010 10:42 am, edited 1 time in total.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Take node1 out of the default node pool (the one with "" as its name) - go back, perhaps, to naming a "conductor" node pool.

This will push all processing onto node 2.

To allow a controlled amount of processing to occur on node 1, design some (only a few) stage types to execute in the "conductor" node pool.

You will need the environment variable to prevent DataStage from using the first-named node in the default node pool as the conductor node.
prasad111
Premium Member
Posts: 173
Joined: Fri May 19, 2006 10:53 am

Post by prasad111 »

Thank you very much for your valuable inputs.

1) After making this change, the processes run on server2 (except the startup process, which has to run on server1, as expected). The behavior didn't change with or without the APT_PM_CONDUCTOR_HOSTNAME parameter, so we don't have to use this parameter. Is my understanding correct?

2) I am exploring this option; it might be very useful for specific jobs.

3) Is this about the APT_PM_CONDUCTOR_HOSTNAME parameter? Is it related to point (1) above?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I cannot provide a cogent answer to these questions without seeing the changed configuration file.
prasad111
Premium Member
Posts: 173
Joined: Fri May 19, 2006 10:53 am

Post by prasad111 »

This is the changed configuration file:

Code:

{
  node "node1"
  {
    fastname "server1"
    pools "conductor"
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
  node "node2"
  {
    fastname "server2"
    pools ""
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
}
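
To make the effect of the node pools in this configuration easier to see, here is a rough Python sketch that lists which nodes belong to the default pool (the one named ""). It is a simplified regex parser, not a DataStage tool: it assumes fastname comes before pools inside each node block, as in the files posted here, and the names `node_pools` and `default_pool_nodes` are invented for this example.

```python
import re

# Configuration text copied from the changed configuration file above.
CONFIG_TEXT = '''
{
  node "node1"
  {
    fastname "server1"
    pools "conductor"
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
  node "node2"
  {
    fastname "server2"
    pools ""
    resource disk "/datastage/Datasets" {pools ""}
    resource scratchdisk "/datastage/Scratch" {pools ""}
  }
}
'''

# Matches: node "name" { fastname "host" pools "a" "b" ...
# Assumption: fastname precedes pools in every node block.
NODE_RE = re.compile(
    r'node\s+"(?P<name>[^"]+)"\s*\{\s*'
    r'fastname\s+"(?P<host>[^"]+)"\s*'
    r'pools\s+(?P<pools>(?:"[^"]*"\s*)+)'
)


def node_pools(config_text):
    """Map each node name to a (fastname, list of node pool names) pair."""
    result = {}
    for match in NODE_RE.finditer(config_text):
        pools = re.findall(r'"([^"]*)"', match.group("pools"))
        result[match.group("name")] = (match.group("host"), pools)
    return result


def default_pool_nodes(config_text):
    """Return the names of nodes that list the default pool ""."""
    return [
        name
        for name, (_host, pools) in node_pools(config_text).items()
        if "" in pools
    ]


print(default_pool_nodes(CONFIG_TEXT))  # ['node2']
```

Only node2 is in the default pool here, which matches the observation that the processing moved to server2 once node1 was placed in the "conductor" pool.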
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DataStage server is installed only on server1? That might explain why the conductor process executes there - it has to look after logging, which is done (optionally) to the local DataStage repository.
prasad111
Premium Member
Premium Member
Posts: 173
Joined: Fri May 19, 2006 10:53 am

Post by prasad111 »

Yes, DataStage is installed only on server1. I didn't see much difference in the processes whether I set the option to store the log in the DataStage repository (XMETA) or in the filesystem (as in version 7.5.x).

For the above configuration file, we don't have to use the APT_PM_CONDUCTOR_HOSTNAME parameter. Is this statement correct?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I believe that that is correct in this case.
prasad111
Premium Member
Posts: 173
Joined: Fri May 19, 2006 10:53 am

Post by prasad111 »

Thanks for all your valuable inputs.