Parallel Startup Failed Error, No Child Processes.
Moderators: chulett, rschirm, roy
Hi,
I have about 74,000 records in a dataset that I need to insert into a new table. When I run a parallel job consisting of a Dataset stage (source), a Transformer, and a target Oracle table, I get the following errors in the Director log:
main_program: APT_PMConnectionRecord::start: waitpid(18708, 0, 0) returned -1, No child processes
main_program: **** Parallel startup failed ****
This is usually due to a configuration error, such as
not having the Orchestrate install directory properly
mounted on all nodes, rsh permissions not correctly
set (via /etc/hosts.equiv or .rhosts), or running from
a directory that is not mounted on all nodes. Look for
error messages in the preceding output.
main_program: A startup script is not being used.
main_program: Unable to contact one or more Section Leaders.
Probable configuration problem; contact Orchestrate system administrator.
The job uses a configuration file with four nodes defined.
Interestingly, I am able to create a table and load 5 million records using the same configuration file.
Any idea what could cause this problem?
Hi,
If you supply the full configuration of your system,
it might help people give an answer.
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hi Ray,
Can you please shed some more light on this?
Actually, I am able to load the same dataset into a sequential file.
When I try to load the table from either a sequential file or a dataset, I hit this error.
The parallel startup fails only when loading into the table.
Thanks in anticipation.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The process table is a low-level component in UNIX (think of the pid as a row number in this table). There is one entry for each process; it is used to keep track of timeslice, execution priority, and so on.
The fact that you can execute when not writing to a table seems to exonerate the process table as a candidate cause. Hence my original opening "it's rare".
Have you checked all the things mentioned in the error message? This seems to suggest an incomplete installation or incomplete configuration of the underlying "Orchestrate" engine, or insufficient remote execution access/privileges.
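For readers puzzling over the first log line: "waitpid(18708, 0, 0) returned -1, No child processes" is the operating system's ECHILD condition, meaning the caller waited on a pid that is not, or is no longer, one of its children. The Python sketch below does not reproduce anything DataStage does internally; it only shows one ordinary way that OS error arises, by waiting twice on the same child. The helper name wait_for is made up for illustration, and the script assumes a UNIX-like system.

```python
import os


def wait_for(pid):
    """Wait on `pid` and describe the outcome instead of raising."""
    try:
        _, status = os.waitpid(pid, 0)
        return f"reaped, exit code {os.WEXITSTATUS(status)}"
    except ChildProcessError as exc:
        # errno ECHILD: `pid` is not a child we can still wait for.
        return f"waitpid failed: {exc.strerror}"


# Start a child that exits immediately with status 0.
pid = os.fork()
if pid == 0:
    os._exit(0)

# The first wait collects the child and frees its process-table entry.
first = wait_for(pid)
# The second wait finds nothing left to collect and fails with ECHILD.
second = wait_for(pid)

print(first)
print(second)
```

In other words, the message says that the child the conductor expected to wait for was already gone, which fits a child that died early for some other reason rather than a full process table.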
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi T42,
Here is my configuration file.
The problem has been fixed: the cause was that the server name was not specified in the Oracle Enterprise stage.
{
    node "node0"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/opt/tempdataset/" {pools ""}
        resource scratchdisk "/opt/tempdataset/scratch/" {pools ""}
    }
    node "node1"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/z03/tempdataset/" {pools ""}
        resource scratchdisk "/z03/tempdataset/scratch/" {pools ""}
    }
    node "node2"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/tmp/tempdataset/" {pools ""}
        resource scratchdisk "/tmp/tempdataset/scratch/" {pools ""}
    }
    node "node3"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/opt/tempdataset/" {pools ""}
        resource scratchdisk "/opt/tempdataset/scratch/" {pools ""}
    }
}
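A quick sanity check of a configuration file like the one above can be sketched in Python. This is not an official tool and it does not understand the full configuration grammar; it only picks out the node, fastname, and resource lines that appear in this post using regular expressions. The function names (parse_nodes, shared_paths, missing_dirs) are invented for illustration, and the directory check is only meaningful when run on the host named in fastname.

```python
import os
import re

# The configuration from the post, embedded so the sketch is self-contained.
SAMPLE_CONFIG = '''
{
    node "node0"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/opt/tempdataset/" {pools ""}
        resource scratchdisk "/opt/tempdataset/scratch/" {pools ""}
    }
    node "node1"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/z03/tempdataset/" {pools ""}
        resource scratchdisk "/z03/tempdataset/scratch/" {pools ""}
    }
    node "node2"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/tmp/tempdataset/" {pools ""}
        resource scratchdisk "/tmp/tempdataset/scratch/" {pools ""}
    }
    node "node3"
    {
        fastname "DBPU2"
        pools ""
        resource disk "/opt/tempdataset/" {pools ""}
        resource scratchdisk "/opt/tempdataset/scratch/" {pools ""}
    }
}
'''

NODE_RE = re.compile(r'^\s*node\s+"([^"]+)"')
FAST_RE = re.compile(r'^\s*fastname\s+"([^"]+)"')
RES_RE = re.compile(r'^\s*resource\s+(disk|scratchdisk)\s+"([^"]+)"')


def parse_nodes(text):
    """Collect name, fastname and resource paths for each node block."""
    nodes = []
    for line in text.splitlines():
        if m := NODE_RE.match(line):
            nodes.append({"name": m.group(1), "fastname": None,
                          "disk": [], "scratchdisk": []})
        elif nodes and (m := FAST_RE.match(line)):
            nodes[-1]["fastname"] = m.group(1)
        elif nodes and (m := RES_RE.match(line)):
            nodes[-1][m.group(1)].append(m.group(2))
    return nodes


def shared_paths(nodes):
    """Map each resource path used by more than one node to those nodes."""
    users = {}
    for node in nodes:
        for kind in ("disk", "scratchdisk"):
            for path in node[kind]:
                users.setdefault(path, []).append(node["name"])
    return {path: names for path, names in users.items() if len(names) > 1}


def missing_dirs(nodes):
    """List (node, kind, path) for resource paths absent on this host."""
    return [(node["name"], kind, path)
            for node in nodes
            for kind in ("disk", "scratchdisk")
            for path in node[kind]
            if not os.path.isdir(path)]


nodes = parse_nodes(SAMPLE_CONFIG)
print("hosts:", sorted({n["fastname"] for n in nodes}))
print("shared paths:", shared_paths(nodes))
print("missing on this host:", missing_dirs(nodes))
```

For this file the sketch reports a single host, DBPU2, for all four logical nodes, and shows that node0 and node3 point at the same disk and scratch directories, which may or may not be intended.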
-
- Participant
- Posts: 82
- Joined: Thu Dec 02, 2004 10:27 pm
- Location: INDIA
Yes, the dataset was created using the same configuration file. No porting was done across projects or systems.
When porting projects from dev to test/production, checking the configuration file and keeping it consistent will avoid such errors.
But my problem was that the log gave no indication that the database server had not been specified. Instead, it pointed me to the configuration/installation.
-
- Participant
- Posts: 82
- Joined: Thu Dec 02, 2004 10:27 pm
- Location: INDIA
Nelab
What is z03 (node1)? Do you have a folder by that name, or is it a typo?
Have you resolved your problem?
nelab28 wrote:
node "node1"
{
fastname "DBPU2"
pools ""
resource disk "/z03/tempdataset/" {pools ""}
resource scratchdisk "/z03/tempdataset/scratch/" {pools ""}
}
dsxuserrio
Kannan.N
Bangalore,INDIA