Configuration File issue
Posted: Thu Jan 04, 2007 5:13 am
Hi All,
I have two machines, system-1 and system-2. I want to cluster them and run a parallel job across both.
So I have set up my configuration file as follows.
{
node "node1"
{
fastname "system-1"
pools ""
resource disk "C:/Ascential/DataStage/Datasets" { pools "" }
resource disk "D:/Ascential/DataStage/Datasets" { pools "" }
resource scratchdisk "C:/Ascential/DataStage/Scratch" { pools "" }
resource scratchdisk "D:/Ascential/DataStage/Scratch" { pools "" }
}
node "node2"
{
fastname "system-2"
pools ""
resource disk "C:/Ascential/DataStage/Datasets" { }
resource disk "D:/Ascential/DataStage/Datasets" { }
resource scratchdisk "C:/Ascential/DataStage/Scratch" { }
resource scratchdisk "D:/Ascential/DataStage/Scratch" { }
}
}
But after configuring this file, when I try to check it, I get the following error message.
##I TFCN 000001 16:30:51(000) <main_program>
Ascential DataStage(tm) Enterprise Edition 7.5
Copyright (c) 2004, 1997-2004 Ascential Software Corporation.
All Rights Reserved
##I TOCK 000000 16:30:51(001) <main_program> OS Charset: ISO-8859-1
##I TOCK 000000 16:30:51(002) <main_program> Input Charset: UTF-8
##I TFSC 000001 16:30:51(003) <main_program> APT configuration file: C:/Ascential/DataStage/Configurations/fit2.apt
RSHD: FIT: could not retrieve password: Please login and run rsetup.
##E TFPM 000152 16:31:22(000) <main_program> Accept timed out retries = 8
##E TFPM 000153 16:31:22(001) <main_program> The section leader on system-1 died
##E TFPM 000356 16:31:22(002) <main_program>
**** Parallel startup failed ****
This is usually due to a configuration error, such as
not having the Orchestrate install directory properly
mounted on all nodes, rsh permissions not correctly
set (via /etc/hosts.equiv or .rhosts), or running from
a directory that is not mounted on all nodes. Look for
error messages in the preceding output.
##I TFPM 000177 16:31:22(003) <main_program> Step started on node system-2; it uses 2 nodes.
The program running the step is /C=/Ascential/DataStage/PXEngine/bin/orchadmin.exe.
##I TFPM 000178 16:31:22(004) <main_program> The ORCHESTRATE startup program in /C=/Ascential/DataStage/PXEngine/etc/standalone.sh is being used.
##I TFPM 000181 16:31:22(005) <main_program> A startup script is not being used.
##I TFPM 000183 16:31:22(006) <main_program> The TCP port being used for startup is 10000; the associated socket number is 3.
##I TFPM 000184 16:31:22(007) <main_program>
Node status:
##I TFPM 000185 16:31:22(008) <main_program> system-2 -
##I TFPM 000186 16:31:22(009) <main_program> OK
##I TFPM 000185 16:31:22(010) <main_program> system-1 -
##I TFPM 000187 16:31:22(011) <main_program> rsh issued, no response received
##E TFPM 000247 16:31:22(012) <main_program> Unable to contact one or more Section Leaders.
Probable configuration problem; contact Orchestrate system administrator.
##E TFSR 000011 16:31:22(013) <main_program> Step execution finished with status = FAILED.
##E TOCK 000000 16:31:22(014) <main_program>
ERROR: Check configuration file failed.
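To make the log above easier to scan, here is a minimal Python sketch that pulls out the error lines and the per-node startup status. It is not part of DataStage: the function names (parse_log, errors, node_status) are hypothetical, and the line layout it expects (##severity, module, message id, time, sequence, source) is only an assumption inferred from the output shown in this post.

```python
import re

# Assumed layout of a log header line, inferred from the output in this post:
#   ##<severity> <module> <msg id> <hh:mm:ss>(<seq>) <source> <text>
LOG_LINE = re.compile(
    r"^##(?P<sev>[A-Z]) (?P<module>\w+) (?P<msgid>\d+) "
    r"(?P<time>\d\d:\d\d:\d\d)\((?P<seq>\d+)\) <(?P<source>[^>]+)>\s*(?P<text>.*)$"
)


def parse_log(text):
    """Split log text into entries; lines without a ## header are
    treated as continuations of the previous entry."""
    entries = []
    for raw in text.splitlines():
        m = LOG_LINE.match(raw)
        if m:
            entries.append(m.groupdict())
        elif entries and raw.strip():
            prev = entries[-1]
            prev["text"] = (prev["text"] + " " + raw.strip()).strip()
    return entries


def errors(entries):
    """Keep only the entries whose severity letter is E."""
    return [e for e in entries if e["sev"] == "E"]


def node_status(entries):
    """Pair each TFPM 000185 line (node name) with the entry after it (status)."""
    status = {}
    for cur, nxt in zip(entries, entries[1:]):
        if cur["module"] == "TFPM" and cur["msgid"] == "000185":
            node = cur["text"].rstrip(" -")
            status[node] = nxt["text"]
    return status


# A few lines copied from the log in this post.
SAMPLE = """\
##E TFPM 000152 16:31:22(000) <main_program> Accept timed out retries = 8
##E TFPM 000153 16:31:22(001) <main_program> The section leader on system-1 died
##I TFPM 000185 16:31:22(008) <main_program> system-2 -
##I TFPM 000186 16:31:22(009) <main_program> OK
##I TFPM 000185 16:31:22(010) <main_program> system-1 -
##I TFPM 000187 16:31:22(011) <main_program> rsh issued, no response received
"""

if __name__ == "__main__":
    parsed = parse_log(SAMPLE)
    for entry in errors(parsed):
        print(entry["msgid"], entry["text"])
    print(node_status(parsed))
```

On the sample lines, this reports system-2 as OK and system-1 as "rsh issued, no response received", which matches the RSHD password message earlier in the log.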
Note: The OS we are using is Windows Server 2003 Enterprise Edition.
Any help would be appreciated.
Thanks in advance.
Gopu