Config file

kiran259
Participant
Posts: 48
Joined: Thu Aug 16, 2007 11:17 pm
Location: United States

Config file

Post by kiran259 »

Here is an example configuration file given in an IBM book.

node "n1" {
    fastname "s1"
    pool "" "n1" "s1" "app2" "sort"
    resource disk "/orch/n1/d1" {}
    resource disk "/orch/n1/d2" {"bigdata"}
    resource scratchdisk "/temp" {"sort"}
}

node "n2" {
    fastname "s2"
    pool "" "n2" "s2" "app1" "sort"
    resource disk "/orch/n2/d1" {}
    resource disk "/orch/n2/d2" {"bigdata"}
    resource scratchdisk "/temp" {"sort"}
}

node "n3" {
    fastname "s3"
    pool "" "n3" "s3" "app1" "sort"
    resource disk "/orch/n3/d1" {}
    resource scratchdisk "/temp" {}
}

node "n4" {
    fastname "s4"
    pool "" "n4" "s4" "app1" "sort"
    resource disk "/orch/n4/d1" {}
    resource scratchdisk "/temp" {}
}

My questions are:
1. Which node is the conductor and which are the section leaders in this scenario, and why?
2. If the last three fastnames are the same, say "s2", how does the conductor scenario change?
3. Is "bigdata" here a user-defined resource pool or a reserved resource pool name?
4. What is the default node pool here, and what changes if I do not specify the named pools (n1, s1, app1, sort)?

Please correct me if I have worded any question wrongly. :)

Thanks in advance
Kiran
Kiran Vaduguri

As soon as the fear approaches near, attack and destroy it.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

What were your answers?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There is only ever one conductor node. Assuming that the APT_CONDUCTOR_NODE environment variable is not set, the conductor node is the first-named node in the default node pool, which in this particular case is "n1".

There will be one section leader process on each of the nodes mentioned in the configuration file, irrespective of whether any player processes end up executing on those nodes.

"bigdata" is not a reserved disk pool name; therefore we can assume that it is a user-defined disk pool name.

The default node pool is always the one whose name is the zero-length string (that is, ""). You must specify at least one node in the default node pool, though this may be implicit if no pools definition at all occurs. You do not need to specify any other node pools.
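
As a rough illustration of the default node pool, here is a minimal sketch of two node definitions. The node names, fastnames and directory paths are hypothetical and not taken from the posted file. The first node names the default pool explicitly; the second omits the pools definition, so by the rule above it should be in the default pool implicitly. With a file like this, the first-named node in the default pool is "nodeA".

```
{
  node "nodeA"
  {
    fastname "hostA"
    pools ""    /* explicitly in the default node pool */
    resource disk "/data/nodeA" {pools ""}
    resource scratchdisk "/scratch/nodeA" {pools ""}
  }
  node "nodeB"
  {
    fastname "hostB"
    /* no pools definition here: implicitly in the default node pool */
    resource disk "/data/nodeB" {pools ""}
    resource scratchdisk "/scratch/nodeB" {pools ""}
  }
}
```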
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kiran259
Participant
Posts: 48
Joined: Thu Aug 16, 2007 11:17 pm
Location: United States

Post by kiran259 »

ray.wurlod wrote:There is only ever one conductor node. Assuming that APT_CONDUCTOR_NODE environment variable is not set, then the conductor node is the first-named node that is in the default node pool - in this par ...
So I can conclude there is only one conductor node, although it can be changed through an environment variable. But I haven't come across APT_CONDUCTOR_NODE in the DSEE 7.5x2 version guide. I assumed that the conductor node is the one where the DSEE server is installed. I am not sure how APT_CONDUCTOR_NODE could be changed dynamically.
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi,

It is APT_PM_CONDUCTOR_HOSTNAME and not APT_CONDUCTOR_NODE.

Did you get answers to your other questions?

The default node pool is the one marked as "" in the node pool list. By default the parallel engine executes a parallel stage on all nodes defined in the default node pool.

There is one conductor node and one section leader per node. In your example there will be 1 conductor and 4 section leaders.

HTH
--Rich
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Re: Config file

Post by mystuff »

kiran259 wrote:Example Config file stated in IBM book.
I have a few basic questions on the config file:
a) What role do node pools and disk pools play? I understand the concept of nodes, but I don't get node pools and disk pools.
b) In the above example, "sort" appears in each node definition. Does it signify that sort operations occur on this node? Similarly, what does "app1" mean?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Stages can be constrained to execute in node pools. For example, all stages that write to files might be constrained to execute in a node pool called "export". Similarly, disk pools can be used to constrain other kinds of operation. For example, if you create a disk pool called "buffer", then all temporary files used by buffer operators will use only those disks. You must have at least one node in the default node pool (the one with "" as its name) and you must have at least one directory in the default disk pool.
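
A minimal sketch of how such pools might look in a configuration file follows. The node names, fastnames and paths are hypothetical, and placing the "buffer" pool on a scratchdisk resource is an assumption made for illustration. Here "nodeA" belongs to both the default node pool and the "export" node pool, so a stage constrained to "export" would run only there, while unconstrained stages can use both nodes.

```
{
  node "nodeA"
  {
    fastname "hostA"
    pools "" "export"    /* default node pool plus the "export" node pool */
    resource disk "/data/nodeA" {pools ""}
    resource scratchdisk "/scratch/nodeA" {pools ""}
    resource scratchdisk "/scratch/nodeA_buf" {pools "buffer"}    /* "buffer" disk pool */
  }
  node "nodeB"
  {
    fastname "hostB"
    pools ""             /* default node pool only */
    resource disk "/data/nodeB" {pools ""}
    resource scratchdisk "/scratch/nodeB" {pools ""}
  }
}
```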

Note that the given configuration file is incorrect - you have omitted the word "pools" from everywhere it is needed.
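
For illustration, here is a sketch of how the first and third nodes of the posted file might read with the word "pools" restored. It assumes the usual enclosing braces around the whole file, which the posted example also lacks. The "n2" and "n4" definitions would follow the same pattern.

```
{
  node "n1"
  {
    fastname "s1"
    pools "" "n1" "s1" "app2" "sort"
    resource disk "/orch/n1/d1" {}
    resource disk "/orch/n1/d2" {pools "bigdata"}
    resource scratchdisk "/temp" {pools "sort"}
  }
  node "n3"
  {
    fastname "s3"
    pools "" "n3" "s3" "app1" "sort"
    resource disk "/orch/n3/d1" {}
    resource scratchdisk "/temp" {}
  }
}
```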