Configuration file issue: configuring DB2 EE stage
Posted: Thu Jun 11, 2009 10:22 pm
Hi,
I am trying to configure the DB2 EE stage between the ETL server (32-bit Linux) and a remote DB2 server (64-bit Linux). All required ports (22, 50100, etc.) are open in the firewall between these servers.
The PXEngine, DSEngine, DSComponent, Configuration, etc. folders are exactly the same on both servers, with owner dsadm and group dstage.
I could execute osh "hostname" on the DB2 server by keeping only that node in the apt file.
We have only one user, dsadm. I set up SSH between the two servers and can execute the command "ssh lcetldbd1 date" without any prompts or issues.
I made the required changes in the remsh and dscomponent files, for example this line in remsh: exec /usr/bin/ssh "@$" etc. I also tried with startup.apt in place and with it renamed.
PATH and LD_LIBRARY_PATH point to ..PXEngine/etc:..... etc.
I am getting the error below when I run a check on my config file.
Why can I not get ssh connectivity from the Designer or from the osh "hostname" command when I can get it from the prompt as the same user?
Please let me know how to debug and fix these errors on check, or whether I am missing anything in this process.
Thanks in advance for your help.
apt file
{
node "node1"
{
fastname "GLBSMU7"
pools ""
resource disk "/opt/data/Datasets" {pools ""}
resource scratchdisk "/opt/data/Scratch" {pools ""}
}
node "node2"
{
fastname "GLBSMU7"
pools ""
resource disk "/opt/data/Datasets" {pools ""}
resource scratchdisk "/opt/data/Scratch" {pools ""}
}
node "node3"
{
fastname "GLBSMU7"
pools ""
resource disk "/opt/data/Datasets" {pools ""}
resource scratchdisk "/opt/data/Scratch" {pools ""}
}
node "node4"
{
fastname "GLBSMU7"
pools ""
resource disk "/opt/data/Datasets" {pools ""}
resource scratchdisk "/opt/data/Scratch" {pools ""}
}
node "db2node1"
{
fastname "lcetldbd1"
pools "DB2"
resource disk "/tmp/" {pools ""}
resource scratchdisk "/tmp/" {pools ""}
}
}
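One way to narrow this down might be to test non-interactive ssh against every fastname in the apt file, since the conductor cannot answer a password or host-key prompt. This is only a rough sketch, assuming OpenSSH on the ETL server and that it is run as dsadm; the function names apt_fastnames and check_hosts are made up for illustration and are not part of DataStage.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not shipped with DataStage.

# Print the distinct fastname values found in a parallel configuration file.
apt_fastnames() {
    sed -n 's/^[[:space:]]*fastname[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | sort -u
}

# Try a non-interactive ssh to each fastname.
# BatchMode=yes makes ssh fail instead of prompting, which is closer to
# what a remote shell launched without a terminal would experience.
check_hosts() {
    for host in $(apt_fastnames "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" hostname >/dev/null 2>&1; then
            echo "$host: ssh OK"
        else
            echo "$host: ssh FAILED"
        fi
    done
}
```

It could be called as check_hosts /opt/IBM/InformationServer/Server/Configurations/default2.apt. A FAILED line for lcetldbd1 would suggest the problem is in the ssh setup rather than in the apt file itself.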
ERROR:
##I IIS-DSEE-TFCN-00001 22:14:12(000) <main_program>
IBM WebSphere DataStage Enterprise Edition 8.1.0.5000
Copyright (c) 2001, 2005-2008 IBM Corporation. All rights reserved
##I IIS-DSEE-TFCN-00006 22:14:12(001) <main_program> conductor uname: -s=Linux; -r=2.6.9-67.ELsmp; -v=#1 SMP Wed Nov 7 13:58:04 EST 2007; -n=GLBSMU7; -m=i686
##I IIS-DSEE-TCOA-00067 22:14:12(002) <main_program> OS charset: UTF-8.
##I IIS-DSEE-TCOA-00068 22:14:12(003) <main_program> Input charset: UTF-8.
##I IIS-DSEE-TFSC-00001 22:14:12(004) <main_program> APT configuration file: /opt/IBM/InformationServer/Server/Configurations/default2.apt
##W IIS-DSEE-TFPM-00152 22:14:42(000) <main_program> Accept timed out retries = 20
##E IIS-DSEE-TFPM-00141 Timeout in step setup on node db2node1
##W IIS-DSEE-TFPM-00152 22:15:12(000) <main_program> Accept timed out retries = 19
##E IIS-DSEE-TFPM-00153 22:15:12(001) <main_program> The section leader on lcetldbd1 died
##E IIS-DSEE-TFPM-00356 22:15:12(002) <main_program>
**** Parallel startup failed ****
This is usually due to a configuration error, such as
not having the Orchestrate install directory properly
mounted on all nodes, rsh permissions not correctly
set (via /etc/hosts.equiv or .rhosts), or running from
a directory that is not mounted on all nodes. Look for
error messages in the preceding output.
##I IIS-DSEE-TFPM-00177 22:15:12(003) <main_program> Step started on node GLBSMU7; it uses 5 nodes.
The program running the step is /opt/IBM/InformationServer/Server/PXEngine/bin/orchadmin.
##I IIS-DSEE-TFPM-00178 22:15:12(004) <main_program> The ORCHESTRATE startup program in /opt/IBM/InformationServer/Server/PXEngine/etc/standalone.sh is being used.
##I IIS-DSEE-TFPM-00181 22:15:12(005) <main_program> A startup script is not being used.
##I IIS-DSEE-TFPM-00183 22:15:12(006) <main_program> The TCP port being used for startup is 10,001; the associated socket number is 5.
##I IIS-DSEE-TFPM-00184 22:15:12(007) <main_program>
Node status:
##I IIS-DSEE-TFPM-00185 22:15:12(008) <main_program> GLBSMU7 -
##I IIS-DSEE-TFPM-00186 22:15:12(009) <main_program> OK
##I IIS-DSEE-TFPM-00185 22:15:12(010) <main_program> GLBSMU7 -
##I IIS-DSEE-TFPM-00186 22:15:12(011) <main_program> OK
##I IIS-DSEE-TFPM-00185 22:15:12(012) <main_program> GLBSMU7 -
##I IIS-DSEE-TFPM-00186 22:15:12(013) <main_program> OK
##I IIS-DSEE-TFPM-00185 22:15:12(014) <main_program> GLBSMU7 -
##I IIS-DSEE-TFPM-00186 22:15:12(015) <main_program> OK
##I IIS-DSEE-TFPM-00185 22:15:12(016) <main_program> lcetldbd1 -
##I IIS-DSEE-TFPM-00187 22:15:12(017) <main_program> rsh issued, no response received
##E IIS-DSEE-TFPM-00247 22:15:12(018) <main_program> Unable to contact one or more Section Leaders.
Probable configuration problem; contact Orchestrate system administrator.
##E IIS-DSEE-TFSC-00011 22:15:12(019) <main_program> Step execution finished with status = FAILED.
##E IIS-DSEE-TCOA-00069 22:15:12(020) <main_program> ERROR: check configuration file failed.