Does DB2 Enterprise stage work on different domains?

mavrick21
Premium Member
Posts: 335
Joined: Sun Apr 23, 2006 11:25 pm

Post by mavrick21 »

Hi,

Can we configure DB2 Enterprise Stage to work in the below scenario?

The DataStage engine and repository are installed on, say, host1.abc.com. We have a DB2 database installed on a different host in the same domain (abc.com), and the DB2 Enterprise stage works fine against it.

There is another DB2 database on a host in a different domain, say host2.xyz.com, but on the same network (same company with two domain names, i.e., parent company and its subsidiary). We can 'rsh' from host1.abc.com to host2.xyz.com and vice versa without a password, but the fully qualified name has to be provided (<hostname>.<domain>.com).

Also host1.abc.com and host2.xyz.com have the same OS and hardware versions.

Would this work? If yes, what fastname should be put in APT_CONFIG_FILE for host2.xyz.com? host2? Assuming I have root privileges on the UNIX box, how do I configure the fastname for host2.xyz.com?


Below are the steps we want to follow:
1) Copy the PX engine from host1.abc.com to host2.xyz.com using the 'copy-orchdist' command in <PXEngine dir>/install/
2) Update/add a few environment variables such as APT_CONFIG_FILE, APT_DB2INSTANCE_HOME, APT_ORCHHOME, ASBHOME, DSHOME, LIBPATH, LD_LIBRARY_PATH, PATH, UDTBIN, UDTHOME and ODBCINI.
3) Add the new DB2 nodes (i.e., with appropriate fastnames) to APT_CONFIG_FILE.

Would the above 3 steps suffice assuming DB2 Enterprise stage can work across different domains?

Thanks in advance for your valuable time and suggestions.

Post by mavrick21 »

Since I didn't get any replies from DS Gurus I went ahead and did the following:

1) Set up a ${HOME}/.rhosts file on both hq1.subsidiary.com and hq1. I can rsh successfully from hq1.subsidiary.com to hq1 and vice versa.

Note 1: hq1 hosts the DS engine and hq1.subsidiary.com hosts the DB2 server. I want to use the DB2 Enterprise stage against the DB2 database on hq1.subsidiary.com.
Note 2: hq1 and hq1.subsidiary.com have the same OS and hardware (AIX 5.3).

2) Ran 'copy-orchdist hq1.subsidiary.com' from hq1, which completed successfully: copy-orchdist finished SUCCESSFULLY.

3) Added export commands to $HOME/.profile on hq1.subsidiary.com for the following environment variables: DSHOME, DB2DIR, DB2INSTANCE, DB2PATH, APT_DB2INSTANCE_HOME, APT_ORCHHOME, ASBHOME, LIBPATH, LD_LIBRARY_PATH, ODBCINI, PATH

4) Changed the APT_CONFIG_FILE to:

Code:

 
{
	node "node1"
	{
		fastname "hq1"
		pools ""
		resource disk "/opt/IBM/Data/Node1/Datasets" {pools ""}
		resource disk "/opt/IBM/Data/Node2/Datasets" {pools ""}
		resource scratchdisk "/opt/IBM/Data/Node1/Scratch" {pools ""}
		resource scratchdisk "/opt/IBM/Data/Node2/Scratch" {pools ""}
	}
	node "db2node1"
	{
		fastname "hq1.subsidiary.com"
		pools "db2"
		resource scratchdisk "/home/dsadm/temp" {pools ""}
		resource disk "/home/dsadm/temp" {pools ""}
	}

}


When I test the configuration file I face an error:

Code:

##I IIS-DSEE-TFCN-00001 13:18:01(000) <main_program> 
IBM WebSphere DataStage Enterprise Edition 8.0.1.5183 
Copyright (c) 2001, 2005-2007 IBM Corporation. All rights reserved

##I IIS-DSEE-TUTL-00031 13:18:01(001) <main_program> The open files limit is 2000; raising to 2147483647.
##I IIS-DSEE-TFCN-00006 13:18:01(002) <main_program> conductor uname: -s=AIX; -r=3; -v=5; -n=hq1; -m=0003D7B9D600
##I IIS-DSEE-TCOA-00067 13:18:01(003) <main_program> OS charset: ISO-8859-1.
##I IIS-DSEE-TCOA-00068 13:18:01(004) <main_program> Input charset: UTF-8.
##I IIS-DSEE-TFSC-00001 13:18:01(005) <main_program> APT configuration file: /opt/IBM/InformationServer/Server/Configurations/NV.apt
##E IIS-DSEE-TFPM-00330 13:18:01(006) <main_program> The Section Leader on node db2node1 has terminated unexpectedly.
##F IIS-DSEE-TFPM-00113 13:20:47(000) <APT_CheckConfigOperator,0> Fatal Error: Unable to start ORCHESTRATE network connection on node node1(hq1): COMPLETEWAIT failed: parallel APT_CheckConfigOperator(0,0)
##F IIS-DSEE-TFPM-00114 13:20:47(000) <APT_RealFileExportOperator in APT_FileExportOperator,0> Fatal Error: Unable to start ORCHESTRATE network connection on node node1 (bi-etl-dev):  APT_PMConnectionSetup:: operator 1(sequential APT_RealFileExportOperator in APT_FileExportOperator)timed out with 1 incomplete incoming connections.
##E IIS-DSEE-TFPM-00192 13:20:47(000) <node_node1> Player 1 terminated unexpectedly.
##E IIS-DSEE-TFPM-00338 13:20:47(000) <main_program> APT_PMsectionLeader(1, node1), player 1 - Unexpected exit status 1.
##E IIS-DSEE-TFPM-00192 13:20:47(001) <node_node1> Player 2 terminated unexpectedly.
##E IIS-DSEE-TFPM-00338 13:20:47(001) <main_program> APT_PMsectionLeader(1, node1), player 2 - Unexpected exit status 1.
##W IIS-DSEE-TFPM-00091 13:20:52(000) <main_program> APT_PMpollUntilZero: WARNING: called with counter = 0
##E IIS-DSEE-TFSC-00011 13:20:57(000) <main_program> Step execution finished with status = FAILED.
##E IIS-DSEE-TCOA-00069 13:20:57(001) <main_program> ERROR: check configuration file failed.

5) I added APT_PM_CONDUCTOR_TIMEOUT as an environment variable and set it to 600, thinking this might be a timeout issue, since I can rsh successfully between the two machines.
Why do I still get "The Section Leader on node db2node1 has terminated unexpectedly."?
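
One thing that may be worth ruling out for step 3: a command started through rsh usually runs in a non-login shell, which may not read $HOME/.profile, so the exports added there might not be visible to a process started remotely. Below is a minimal sketch of a check, assuming a POSIX shell on both hosts. `missing_vars` is a hypothetical helper name, not a DataStage tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of DataStage): print the name of every
# variable in the argument list that is unset or empty in this shell.
missing_vars() {
    for name in "$@"; do
        # Indirect lookup: read the value of the variable called $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

If the function is placed in a script on the DB2 host and that script is run from hq1 through rsh with the names from step 3 (for example `missing_vars DSHOME APT_ORCHHOME LIBPATH PATH`), any name it prints is not set in the non-interactive session.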

Can anyone please help me out?

Thanks for your time.

Post by mavrick21 »

This is strange!

Now I get the following error:

Code:

##I IIS-DSEE-TFCN-00001 13:55:21(000) <main_program> 
IBM WebSphere DataStage Enterprise Edition 8.0.1.5183 
Copyright (c) 2001, 2005-2007 IBM Corporation. All rights reserved
 


##I IIS-DSEE-TUTL-00031 13:55:21(001) <main_program> The open files limit is 2000; raising to 2147483647.
##I IIS-DSEE-TFCN-00006 13:55:21(002) <main_program> conductor uname: -s=AIX; -r=3; -v=5; -n=hq1; -m=0003D7B9D600
##I IIS-DSEE-TCOA-00067 13:55:21(003) <main_program> OS charset: ISO-8859-1.
##I IIS-DSEE-TCOA-00068 13:55:21(004) <main_program> Input charset: UTF-8.
##I IIS-DSEE-TFSC-00001 13:55:21(005) <main_program> APT configuration file: /opt/IBM/InformationServer/Server/Configurations/NV.apt
##F IIS-DSEE-TFPM-00347 13:55:22(000) <main_program> Fatal Error: Service table transmission failed for db2node1 (hq1.subsidiary.com:Broken pipe.  This may indicate a network problem. Setting APT_PM_CONDUCTOR_TIMEOUT to a larger value (when unset, it defaults to 60) may alleviate this problem.

I have already set APT_PM_CONDUCTOR_TIMEOUT to 600 using Administrator for this project.

Any idea why I get these errors? Could this be because the fastname of the DB2 server contains a domain name?

Thanks for your time.

Post by mavrick21 »

Can this be the problem?

Let DSEngine server be hq1 and server having DB2 be hq2.subsidiary.com.

I'm able to rsh from hq2.subsidiary.com to hq1 just by issuing
$ rsh hq1

but when I try the same from hq1 to hq2.subsidiary.com by issuing
$ rsh hq2
it doesn't work, whereas
$ rsh hq2.subsidiary.com
does.

Should I ask the UNIX admin to assign a fastname to hq2.subsidiary.com so that we can $ rsh hq2 from hq1?
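
As far as I understand, the parallel engine reaches remote nodes through the fastname given in the configuration file, so each fastname should work with rsh exactly as written. Below is a small sketch for pulling the fastnames out of a configuration file so they can be tried one by one. `list_fastnames` is a hypothetical helper, and the sed pattern assumes each entry sits on its own line in the form fastname "host", as in the file posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print every fastname found in a parallel
# configuration file, one per line. Assumes each entry sits on its own
# line in the form:  fastname "host"
list_fastnames() {
    sed -n 's/^[[:space:]]*fastname[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}
```

Each printed name could then be tried from hq1 with, for example, `rsh <fastname> uname -n`. A name that fails there would likely fail for the section leader too.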

Post by mavrick21 »

We got the rsh issue resolved. However we still face the below error.

Code:

##I IIS-DSEE-TFCN-00001 13:55:21(000) <main_program> 
IBM WebSphere DataStage Enterprise Edition 8.0.1.5183 
Copyright (c) 2001, 2005-2007 IBM Corporation. All rights reserved 
  


##I IIS-DSEE-TUTL-00031 13:55:21(001) <main_program> The open files limit is 2000; raising to 2147483647. 
##I IIS-DSEE-TFCN-00006 13:55:21(002) <main_program> conductor uname: -s=AIX; -r=3; -v=5; -n=hq1; -m=0003D7B9D600 
##I IIS-DSEE-TCOA-00067 13:55:21(003) <main_program> OS charset: ISO-8859-1. 
##I IIS-DSEE-TCOA-00068 13:55:21(004) <main_program> Input charset: UTF-8. 
##I IIS-DSEE-TFSC-00001 13:55:21(005) <main_program> APT configuration file: /opt/IBM/InformationServer/Server/Configurations/NV.apt 
##F IIS-DSEE-TFPM-00347 13:55:22(000) <main_program> Fatal Error: Service table transmission failed for db2node1 (hq1.subsidiary.com:Broken pipe.  This may indicate a network problem. Setting APT_PM_CONDUCTOR_TIMEOUT to a larger value (when unset, it defaults to 60) may alleviate this problem.

Any idea why this might be?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Just so you're not feeling so alone and spending all your time talking to yourself, thought I'd stop by. I've only been running PX on a single server so far and don't even have a second box to play with, so can't really add anything relevant to the discussion. :(

Well, other than to ask: what exactly did you do to resolve the "rsh issue"?
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by mavrick21 »

The UNIX admins added a search entry to the /etc/resolv.conf file.
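
For context, the search entry in /etc/resolv.conf lists domains that the resolver appends to a short host name, which is why a bare rsh hq2 can start working once the right domain is listed. Below is a minimal sketch that reads the search list from a resolv.conf-style file and prints the fully qualified candidates for a short name. Both function names are hypothetical, and the sketch simply reports the last search line it finds.

```shell
#!/bin/sh
# Hypothetical helper: print the domains from the last "search" line of a
# resolv.conf-style file, separated by single spaces.
search_domains() {
    awk '$1 == "search" {
             line = ""
             sep = ""
             for (i = 2; i <= NF; i++) {
                 line = line sep $i
                 sep = " "
             }
         }
         END { print line }' "$1"
}

# Hypothetical helper: print <short name>.<domain> for each search domain.
# $1 is the short host name, $2 is the resolv.conf-style file to read.
fqdn_candidates() {
    for domain in $(search_domains "$2"); do
        printf '%s.%s\n' "$1" "$domain"
    done
}
```

For example, `fqdn_candidates hq2 /etc/resolv.conf` would list the names a lookup of hq2 might try. The actual search entry the admins added is not shown in this thread.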

Craig,
Even though rsh works fine from a PuTTY session, why am I still facing this error? Any guesses?

Thanks.

Post by chulett »

Not really. Broken pipes can be tricky to track down. Things like network issues (as noted), permission problems where the named pipe is being created, disk space issues there, etc. can cause them. Perhaps googling for broken pipe troubleshooting help could turn something up?
-craig

Post by mavrick21 »

Craig,

When I test the config file I see the following error

Code:

 ##F IIS-DSEE-TFPM-00347 13:55:22(000) <main_program> Fatal Error: Service table transmission failed for db2node1 (hq1.subsidiary.com:Broken pipe.  This may indicate a network problem. Setting APT_PM_CONDUCTOR_TIMEOUT to a larger value (when unset, it defaults to 60) may alleviate this problem. 

If I set APT_PM_CONDUCTOR_TIMEOUT to a large value, say 6000, I still see this error. Does this mean that the values of APT_* variables are not considered when we test the config file?

Thanks.

Post by chulett »

I would think that they are but honestly I can't say for sure. You'd probably need to confirm or deny that with your official support provider.
-craig