Configuration File Issue in Parallel Jobs


ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

Hi,
I have installed DataStage 8.0.1 on Windows Server 2003. The server jobs run fine, but for parallel jobs I get the following error when running the configuration file check.


##I IIS-DSEE-TFCN-00001 10:11:18(000) <main_program>
IBM WebSphere DataStage Enterprise Edition 8.0.1.4458
Copyright (c) 2001, 2005-2007 IBM Corporation. All rights reserved



##I IIS-DSEE-TCOA-00067 10:11:18(001) <main_program> OS charset: windows-1252.
##I IIS-DSEE-TCOA-00068 10:11:18(002) <main_program> Input charset: UTF-8.
##I IIS-DSEE-TFSC-00001 10:11:18(003) <main_program> APT configuration file: C:/IBM/InformationServer/Server/Configurations/default.apt
Toolkit\mksnt\sh.exe: Toolkit\mksnt\sh.exe: not found
##W IIS-DSEE-TFPM-00152 10:11:48(000) <main_program> Accept timed out retries = 4
##E IIS-DSEE-TFPM-00153 10:11:48(001) <main_program> The section leader on TVMKVM95050D died
##E IIS-DSEE-TFPM-00356 10:11:48(002) <main_program>

**** Parallel startup failed ****

This is usually due to a configuration error, such as
not having the Orchestrate install directory properly
mounted on all nodes, rsh permissions not correctly
set (via /etc/hosts.equiv or .rhosts), or running from
a directory that is not mounted on all nodes. Look for
error messages in the preceding output.


##I IIS-DSEE-TFPM-00177 10:11:48(003) <main_program> Step started on node TVMKVM95050D; it uses 1 nodes.
The program running the step is /C=/IBM/InformationServer/Server/PXEngine/bin/orchadmin.exe.

##I IIS-DSEE-TFPM-00178 10:11:48(004) <main_program> The ORCHESTRATE startup program in /C=/IBM/InformationServer/Server/PXEngine/etc/standalone.sh is being used.

##I IIS-DSEE-TFPM-00181 10:11:48(005) <main_program> A startup script is not being used.

##I IIS-DSEE-TFPM-00183 10:11:48(006) <main_program> The TCP port being used for startup is 10,000; the associated socket number is 3.

##I IIS-DSEE-TFPM-00184 10:11:48(007) <main_program>
Node status:


##I IIS-DSEE-TFPM-00185 10:11:48(008) <main_program> TVMKVM95050D -
##I IIS-DSEE-TFPM-00187 10:11:48(009) <main_program> rsh issued, no response received


##E IIS-DSEE-TFPM-00247 10:11:48(010) <main_program> Unable to contact one or more Section Leaders.
Probable configuration problem; contact Orchestrate system administrator.

##E IIS-DSEE-TFSC-00011 10:11:48(011) <main_program> Step execution finished with status = FAILED.
##E IIS-DSEE-TCOA-00069 10:11:48(012) <main_program> ERROR: check configuration file failed.


Because of this, parallel jobs are also getting aborted. Please help.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Because of what?

There are a couple of errors in the output. Did you review/remedy where it suggested that you do so?

That it can't find the MKS Toolkit sh.exe program is probably an issue with your PATH environment variable (command search list).

Inability to execute on remote nodes is usually due to an incomplete setup of rsh and/or the permissions required for it.
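
One way to test the PATH theory is to walk the search list by hand. This is a minimal sketch, assuming a POSIX-style sh (such as the MKS one) is available. find_in_path is a hypothetical helper, not part of DataStage or MKS. The separator is passed explicitly because Windows PATH entries are separated by ';' while Unix uses ':'.

```sh
# find_in_path NAME PATHLIST [SEP]
# Hypothetical helper: print the first "DIR/NAME" that exists as a regular
# file, where DIR comes from PATHLIST split on SEP (default ':').
find_in_path() {
    name=$1
    list=$2
    sep=${3:-:}
    old_ifs=$IFS
    IFS=$sep          # split only on the separator, so spaces in DIR survive
    set -f            # no filename globbing while splitting
    for dir in $list; do
        if [ -f "$dir/$name" ]; then
            IFS=$old_ifs
            set +f
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Example call on Windows (assumption: PATH uses ';' there):
#   find_in_path sh.exe "$PATH" ';'
```

If this prints nothing for the user that actually runs the job, the directory holding sh.exe is not on that user's search list.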
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

The PATH environment variable does contain the value C:\PROGRA~1\MKS Toolkit\mksnt; which is the folder containing sh.exe. I still get the "sh.exe not found" error. I even tried replacing the short-path variant with PROGRAM FILES, but with no success.
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Actually, there should be a "SHELL" environment variable populated under System Variables that points to C:\PROGRA~1\MKS Toolkit\mksnt\sh.exe.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Is that the actual PATH variable for the user ID executing your DataStage job? (Add ExecSH as a before-job subroutine to execute the id command to find out who this user is.)
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

The SHELL environment variable is also present with the value C:/PROGRA~1/MKS Toolkit/mksnt/sh.exe
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

I checked for the user id executing the DS job. It is returning the correct user id in the Domain\Username format.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Looks to me like it is saying that it can't find 'default.apt' not 'sh.exe'. And that whole 'rsh issued, no response received' is problematical as well.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There's no problem with the configuration file - that one's an informational message (see the "##I") reporting which configuration file is in use.
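
To separate the informational lines from the ones that matter, the check output can be filtered on that leading marker. This is a minimal sketch, assuming the output has been saved to a text file and that ##W and ##E mark warnings and errors in the same way ##I marks information (the log above suggests this, but it is an inference). show_problems and the file name are hypothetical.

```sh
# show_problems FILE
# Hypothetical helper: print only the lines that start with ##W or ##E,
# skipping the ##I (informational) lines.
show_problems() {
    grep -E '^##[WE] ' "$1"
}

# Example call (check_config.log is an assumed file name):
#   show_problems check_config.log
```

Note that unmarked lines, such as the "sh.exe: not found" message, do not carry a ## prefix and are dropped by this filter, so the raw output is still worth reading.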

I still think it's .rhosts or hosts.equiv, and it's probably one (or more) of the remote nodes that can't see sh.exe.

Have you made the parallel engine visible on all nodes?
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

He is running DS on Windows Server 2003. There is no .rhosts or hosts.equiv on Windows Server 2003. I think MKS Toolkit is causing the problem. Do you have multiple versions of MKS Toolkit on that server, or do you have Cygwin installed on that box?
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

I am working on only one node. Also, I have only MKS Platform Components 9.x installed, and that was installed by the Suite itself; Cygwin is not installed on that machine. I suspect that the space in the name "MKS Toolkit" is causing the problem, because the error shows Toolkit/mksnt/sh.exe whereas it should be C:/Program Files/MKS Toolkit/mksnt/sh.exe. Is there any way to specify the MKS installation directory while installing the Suite?
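
To illustrate the space theory: the following is a minimal sketch, assuming the shell path is expanded somewhere without quotes. That is a guess; nothing in the log confirms where or whether this happens inside DataStage. last_word is a hypothetical helper, and the path is the value quoted earlier in this thread.

```sh
# Value taken from the PATH/SHELL settings quoted in this thread.
shell_path='C:\PROGRA~1\MKS Toolkit\mksnt\sh.exe'

# last_word VALUE
# Hypothetical helper: print the last whitespace-separated word of VALUE,
# i.e. what is left at the end when VALUE is expanded without quotes.
last_word() {
    set -f          # disable globbing so only word splitting happens
    set -- $1       # unquoted on purpose: the shell splits on the space
    set +f
    while [ $# -gt 1 ]; do
        shift
    done
    printf '%s\n' "$1"
}

last_word "$shell_path"   # prints: Toolkit\mksnt\sh.exe
```

The fragment after the space is exactly the name reported in the "Toolkit\mksnt\sh.exe: not found" message, which is consistent with the path being split at the space.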
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

I also noticed that the "C:/Program Files/MKS" part is missing. Please open a command prompt and type the following:
sh
> which bash
> echo $SHELL

Check that the directories returned by these commands are correct.
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

The directories are correct.
C:\PROGRA~1\MKS Toolkit\mksnt/bash.exe and
C:/PROGRA~1/MKS Toolkit/mksnt/sh.exe
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Hmmm... on my Windows Server 2003 machine, I get C:\PROGRA~1\MKSTOO~1\mksnt/bash.exe and C:/PROGRA~1/MKSTOO~1/mksnt/sh.exe. There is no space between MKS and Toolkit.
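
The difference between the two machines comes down to whether the value still contains a space. Here is a minimal sketch for checking any such value. has_space is a hypothetical helper, and the two sample paths are the ones quoted in this thread.

```sh
# has_space VALUE
# Hypothetical helper: succeed (exit status 0) if VALUE contains a space.
has_space() {
    case $1 in
        *" "*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Compare the long-name form with the 8.3 short-name form.
for p in 'C:/PROGRA~1/MKS Toolkit/mksnt/sh.exe' \
         'C:/PROGRA~1/MKSTOO~1/mksnt/sh.exe'; do
    if has_space "$p"; then
        printf 'contains a space: %s\n' "$p"
    else
        printf 'no space: %s\n' "$p"
    fi
done
```

The same check can be pointed at "$SHELL" or at individual PATH entries on the affected server.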
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

Did you specify the installation directory for MKS Toolkit anywhere while installing Information Server (v8.0.1)?