How do I configure for multiple network interfaces on SMP?

track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

OK....I should apologize for two things. I am a little slow in responding to this post, but I had to go dig up my info on how I saw this done. Also, this is a rather lengthy response--but then again, it's not a quick setup either. So go grab a cold beverage and some chips, and then let's consider the following scenario.

I have two servers: one runs PX (server A), the other runs DB2 (server B). I have two network cards in each server: a standard 10/100 NIC and a Gbit NIC. I want to configure PX to use the Gbit NIC to get data into and out of DB2 faster. I have already set up the remote server to handle remote DB2 calls, and I have DB2 set up (in my db2nodes.cfg) to run on the remote node.

With that in mind, I need to gather some information so that I can construct the config file for PX to use. I need to get this info from both servers, BTW.

From ifconfig -a, I can see both network interfaces and their configurations. If I take the IP address listed in the ifconfig results, I can then issue 'host {IP addr}', which will tell me the name of the interface. I also need to know the result of uname -n. This should match up with one of the network names that I just gathered. This gives me results like this:

server A:
int   IP              name      speed
en0   192.168.1.100   mainetl   10/100
en1   192.168.1.101   fastetl   Gbit

server B:
int   IP              name      speed
en0   192.168.1.102   maindb2   10/100
en1   192.168.1.103   fastdb2   Gbit
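
A rough sketch of the name check described above, assuming a POSIX shell. The function name match_node_name is made up for illustration, and the interface names are the ones gathered for server A; only the short host name is compared, in case one of the lookups returns a fully qualified name.

```shell
#!/bin/sh
# Print the gathered interface name that matches the node name.
# $1 = node name (the result of uname -n)
# remaining arguments = interface names gathered with 'host {IP addr}'
match_node_name() {
    node=$1
    shift
    for name in "$@"; do
        # Compare short host names only (strip any domain suffix).
        if [ "${name%%.*}" = "${node%%.*}" ]; then
            echo "$name"
            return 0
        fi
    done
    # No gathered name matches the node name.
    return 1
}
```

On server A this would be called as match_node_name "$(uname -n)" mainetl fastetl. A non-zero return code means none of the gathered names matches the node name.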

I could then create a config file that looked like this:

{
        node "node0"
        {
                fastname "mainetl"
                pools "Conductor"
                resource disk "/Datasets" {pools ""}
                resource scratchdisk "/Scratch" {pools ""}
        }
        node "node1"
        {
                fastname "fastetl"
                pools ""
                resource disk "/Datasets" {pools ""}
                resource scratchdisk "/Scratch" {pools ""}
        }
        node "maindb2"
        {
                fastname "fastdb2"
                pools "DB2"
                resource disk "/Datasets" {pools ""}
                resource scratchdisk "/Scratch" {pools ""}
        }
}

HTH....
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

I believe I have tried that, but jobs abort with these messages:

main_program: Accept timed out retries = 8

main_program: The section leader on rcrpdev1b died

main_program: **** Parallel startup failed ****
This is usually due to a configuration error, such as
not having the Orchestrate install directory properly
mounted on all nodes, rsh permissions not correctly
set (via /etc/hosts.equiv or .rhosts), or running from
a directory that is not mounted on all nodes. Look for
error messages in the preceding output.

APT_CONFIG_FILE:

Code: Select all

{
        node "rcrpdev101"
        {
                fastname "rcrpdev1a"
                pools "Conductor"
                resource disk "/Work/Ascential/Dset01" {pools ""}
                resource scratchdisk "/Work/Ascential/Scr01" {pools ""}
        }
        node "rcrpdev102"
        {
                fastname "rcrpdev1a"
                pools ""
                resource disk "/Work/Ascential/Dset01" {pools ""}
                resource scratchdisk "/Work/Ascential/Scr01" {pools ""}
        }
        node "rcrpdev103"
        {
                fastname "rcrpdev1b"
                pools ""
                resource disk "/Work/Ascential/Dset01" {pools ""}
                resource scratchdisk "/Work/Ascential/Scr01" {pools ""}
        }
}
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

Do you have rsh set up on the remote server to allow the DS user to log in without a password?
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

I do. It is the same server. The interfaces rcrpdev1a and rcrpdev1b are two different interfaces into the same server.
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

Are you saying that you are trying to call two network interfaces on the same server and then not specifying a remote server to connect to? If so, I'm not sure why you're trying to do that. If I am not understanding what you're saying, please let me know. Maybe if I saw your config file it would help.
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

I am trying to load balance the network IO across multiple network interfaces that are on the same AIX SMP server. I put my APT_CONFIG_FILE in a previous post. Please look at the file and offer any suggestions. I would be happy with a round-robin approach, that is, one config file for this network interface and another config file for that network interface. That did not work either, though: I get the same error messages when I try a network interface that did not start the service.


TX.
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

Ultra--

I understand that you're trying to use both network interfaces on a single SMP. But....... What network resources are you attempting to reach? Is there a database or file somewhere on the network you're wanting to use?

As for the thought of using multiple config files in a round-robin approach for a single job, this is not possible, as you can only call one config file per job at runtime. If you wanted to physically partition your data and call the job multiple times, each with its own config file, that would be possible. Also, as you have already seen, the conductor node has to be the equivalent of the uname -n result. If not, the engine can't figure out where all of the runtime files are located (local node or remote node).
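
One way to check the conductor point before running a job, as a minimal sketch only. It assumes a POSIX shell with awk, and a config file laid out like the ones in this thread, with the fastname line ahead of the pools line inside each node. The function names conductor_fastname and check_conductor are made up for illustration.

```shell
#!/bin/sh
# Print the fastname of the first node whose pools line lists "Conductor".
# $1 = path to the config file
conductor_fastname() {
    awk '
        # Remember the most recent fastname, with the quotes stripped.
        $1 == "fastname" { gsub(/"/, "", $2); fast = $2 }
        # Node-level pools line naming the Conductor pool.
        $1 == "pools" && /"Conductor"/ { print fast; exit }
    ' "$1"
}

# Compare that fastname with the result of uname -n.
# $1 = path to the config file
check_conductor() {
    fast=$(conductor_fastname "$1")
    node=$(uname -n)
    if [ "$fast" = "$node" ]; then
        echo "OK: conductor fastname $fast matches uname -n"
    else
        echo "MISMATCH: conductor fastname '$fast', uname -n '$node'"
        return 1
    fi
}
```

Usage would be something like check_conductor "$APT_CONFIG_FILE". The comparison is an exact string match, so a fully qualified name on one side and a short name on the other will be reported as a mismatch.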

--ts
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

Argh, it would not be for the same job. The first job uses file 1, the second job uses file 2, the third job uses file1.

Thus,
Each job would use a file according to its natural number ordering for when it is deployed.
You know,
if x%2 = 1 then use file 1, if x % 2 = 0 then use file 2. Something simple like that.

track_star wrote:I understand that you're trying to use both network interfaces on a single SMP. But....... What network resources are you attempting to reach?
Sybase, Oracle, MSSQL, DB2/UDB AIX, DB2 OS390, Supra, Total, Teradata.

track_star wrote:Is there a database or file somewhere on the network you're wanting to use?
Not in general yet.

This is a question in general.
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

Ultramundane wrote:Argh, it would not be for the same job. The first job uses file 1, the second job uses file 2, the third job uses file1.

Thus,
Each job would use a file according to its natural number ordering for when it is deployed.
You know,
if x%2 = 1 then use file 1, if x % 2 = 0 then use file 2. Something simple like that.
OK....thx for the clarification--in earlier posts it sounded like you were wanting to call multiple config files in the same job. If you want to call a config file based on job numbering or some sort of sequence, you can do that from a shell script. You'll have to use the API to start the jobs, but dsjob will accept params from the cmd line, so you could figure out a way of numbering or naming your config files and call them at runtime based on the job you're calling.
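
A minimal sketch of that shell-script idea, not a tested setup. It assumes the job exposes $APT_CONFIG_FILE as a job parameter so that dsjob can override it from the command line; the directory, the file names file1.apt and file2.apt, and the function names are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical location of the two config files; adjust as needed.
CONFIG_DIR=/opt/configs

# Print the config file for a job's sequence number:
# odd numbers get file 1, even numbers get file 2.
pick_config() {
    if [ $(( $1 % 2 )) -eq 1 ]; then
        echo "$CONFIG_DIR/file1.apt"
    else
        echo "$CONFIG_DIR/file2.apt"
    fi
}

# Start one job with the chosen config file.
# $1 = sequence number, $2 = project name, $3 = job name
# Assumption: the job defines $APT_CONFIG_FILE as a job parameter.
run_job() {
    dsjob -run -param "\$APT_CONFIG_FILE=$(pick_config "$1")" "$2" "$3"
}
```

A wrapper would then call, for example, run_job 1 MyProject JobA and run_job 2 MyProject JobB (project and job names are placeholders). This only alternates which config file each job gets; it does not change which interfaces the engine can actually use.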
Ultramundane wrote:I understand that you're trying to use both network interfaces on a single SMP. But....... What network resources are you attempting to reach? Sybase, Oracle, MSSQL, DB2/UDB AIX, DB2 OS390, Supra, Total, Teradata. Is there a database or file somewhere on the network you're wanting to use? Not in general yet.

This is a question in general.
OK...again, thx for the clarification. As for this scenario, you need to keep in mind the reason that the scenario I presented earlier works. In a remote DB2 environment, you have to set up the nodes in a cluster. This takes some work, but you can definitely do this from a single serial number. Read the manual on copyorchdist, and this should show you how to set up alternate nodes in your environment as part of a cluster. This allows you to create a config file that calls the faster network adapters. HTH...
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

Hmmm. I cannot find any information on copyorchdist, and I believe that I have searched all of the PDFs that were installed on my PC. Well, that is another issue: the lousy documentation and the lack of a good interface. I guess a nice interface like dynatext would have been too expensive. Anyway, do you know where I can find this documentation?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Orchestrate manuals are handed out (on CD) with every DS EE class, and can be downloaded from the eServices web site. Search on this forum to find the link.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

Thanks Ray. I found it. I had to search on copy-orchdist and not copyorchdist. I will do some reading like track_star has suggested. However, I have been working with Ascential support now for about 2 weeks on this issue as well. They have no clue how to do it. At least I am getting some suggestions on this website.

Thanks.
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

After slightly over two weeks Ascential finally said that it cannot be done. That is, a single SMP server with just one OS and multiple network interfaces cannot be load balanced for network traffic.