NODES AND SCRATCH ON SAN DISKS

srireddypunuru
Premium Member
Posts: 40
Joined: Thu Jul 10, 2008 12:45 pm

Post by srireddypunuru »

Team,

We are running IS 8.5 on a Windows box with 4 CPUs. The D and E drives are both on SAN, and IS 8.5 is installed on D.

The E drive has 200 GB of free space, and our 4-node config file points its dataset and scratch resources there.

The issues we are facing:

1) Lots of OSH processes are being created, and jobs are failing with errors such as "Unable to allocate resources".
2) IBM gave us a patch, JR41358_PXE_windows_8501.

They are also suggesting that we reduce the config file to 2 nodes.

Below is our config file. Our job designs aren't that complex, but reducing the nodes to 2 will increase the execution time of the jobs.

Any thoughts on our situation would be really appreciated.

main_program: APT configuration file: D:/IBM/InformationServer/Server/Configurations/default.apt

Code: Select all

{	node "node1"
	{
		fastname "ms"
		pools ""
		resource disk "E:/IBM/InformationServer/Server/Datasets" {pools ""}
		resource scratchdisk "E:/IBM/InformationServer/Server/Scratch" {pools ""}
	}

	node "node2"
	{
		fastname "ms979"
		pools ""
		resource disk "E:/IBM/InformationServer/Server/Datasets" {pools ""}
		resource scratchdisk "E:/IBM/InformationServer/Server/Scratch" {pools ""}
	}
	node "node3"
	{
		fastname "ms979"
		pools ""
		resource disk "E:/IBM/InformationServer/Server/Datasets" {pools ""}
		resource scratchdisk "E:/IBM/InformationServer/Server/Scratch" {pools ""}
	}
	node "node4"
	{
		fastname "ms979"
		pools ""
		resource disk "E:/IBM/InformationServer/Server/Datasets" {pools ""}
		resource scratchdisk "E:/IBM/InformationServer/Server/Scratch" {pools ""}
	}
}
Srikanth Reddy
Integration Consultant
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

The key would be finding out which resources it is unable to allocate.

Using SAN should not be an issue.

If your disk is full, the error makes sense. If not, it could be referring to memory. Do you have a lot of old, large files filling up your 200 GB?
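
One way to answer the "old, large files" question is to list them. The following is only a minimal sketch: the function name `find_large_old_files` and the size and age thresholds are made up for illustration, and the scratch path is the one from the config file posted above. It only lists files and deletes nothing.

```python
import os
import time


def find_large_old_files(root, min_bytes, min_age_days):
    """Return (path, size_bytes, age_days) for big, stale files under root, largest first."""
    now = time.time()
    cutoff = now - min_age_days * 86400
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_size >= min_bytes and info.st_mtime <= cutoff:
                hits.append((path, info.st_size, (now - info.st_mtime) / 86400))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Path from the config file in this thread; 100 MB and 7 days are arbitrary thresholds.
    scratch = "E:/IBM/InformationServer/Server/Scratch"
    for path, size, age in find_large_old_files(scratch, 100 * 1024**2, 7):
        print(f"{size / 1024**3:8.2f} GB  {age:6.1f} days  {path}")
```

Pointing the same function at the Datasets directory shows what is taking space there, but dataset segment files are normally removed through DataStage's own dataset tools rather than by hand.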

Find out why they recommend 2 nodes.

Are there any clues in the README from the patch they gave you?
Choose a job you love, and you will never have to work a day in your life. - Confucius
srireddypunuru
Premium Member
Posts: 40
Joined: Thu Jul 10, 2008 12:45 pm

Post by srireddypunuru »

Thanks Eric,

1) I have cleaned up the disk; out of 200 GB I now have 150 GB free.

Please find the README from the patch below.
PATCH FOR APAR : JR41358
PATCH NAME : patch_JR41358_PXE_windows_8501
ENGINEERING TEAM : Parallel Framework
COMPONENT : PXEngine
TIERS : Engine
OPERATING SYSTEM : windows 32 & 64 bit
SUITE VERSION* : 8.5.0.1
UNINSTALL** : Supported
RECOMPILE JOBS : None Required

* This patch requires that IBM Information Server suite and component be
installed at the exact level shown and no other.

** If the patch can be uninstalled (see above) and you need to uninstall it,
see the patch installation instructions for information on uninstalling.

PROBLEM:
Intermittent job failure when using shared memory for interprocess communication.
Jobs fail with the following fatal error:
Unable to initialize communication channel on XXXX. This is typically caused by a
configuration problem. Examples of typical problems include:
1) The temporary directory, identified by $TMPDIR and/or the scratch disks in your
ORCHESTRATE configuration, is located on a non-local file system (e. g. mounted over NFS).
2) The temporary directory is located on a file system with insufficient space.

RESOLUTION:
Fixed an issue with shared memory file name string handling.

I have 4 CPUs, and they recommended 2 nodes, saying it is a rule of thumb that a machine with N CPUs should have N-2 nodes, in general, for small development systems like ours.

Thanks
Sri
Cincinnati OH
Srikanth Reddy
Integration Consultant
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

How much RAM do you have? Are you doing RAM-intensive jobs such as large Lookups? Are you running low-volume jobs against that config file? That results in a hell of a lot of useless data partitioning and repartitioning, and lots of processes you don't need. Consider having a single-node config file that is used for low-volume jobs; that may free up resources to run your high-volume jobs across four nodes. I would expect that only a small percentage of your jobs need to run on a 4-node config.
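
Keeping a 1-node and a 4-node file side by side is easier if both come from one template. A minimal sketch, assuming every node shares the same fastname and the same Datasets and Scratch paths as in the config posted earlier in this thread; the function name `render_apt_config` and the output layout are illustrative only, not anything IBM ships:

```python
def render_apt_config(fastname, node_count, dataset_dir, scratch_dir):
    """Build the text of a parallel configuration file with node_count identical nodes."""
    blocks = []
    for index in range(1, node_count + 1):
        blocks.append(
            f'\tnode "node{index}"\n'
            "\t{\n"
            f'\t\tfastname "{fastname}"\n'
            '\t\tpools ""\n'
            f'\t\tresource disk "{dataset_dir}" {{pools ""}}\n'
            f'\t\tresource scratchdisk "{scratch_dir}" {{pools ""}}\n'
            "\t}\n"
        )
    return "{\n" + "".join(blocks) + "}\n"


if __name__ == "__main__":
    # Fastname and paths copied from the config file in this thread.
    print(render_apt_config(
        "ms979",
        1,
        "E:/IBM/InformationServer/Server/Datasets",
        "E:/IBM/InformationServer/Server/Scratch",
    ))
```

A job is pointed at one file or the other through the APT_CONFIG_FILE environment variable, typically exposed as a job parameter, so low-volume jobs can take the 1-node file while the few high-volume jobs keep the 4-node one.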
srireddypunuru
Premium Member
Posts: 40
Joined: Thu Jul 10, 2008 12:45 pm

Post by srireddypunuru »

The IBM team suggested creating 4 mount points on 4 LUNs:

Code: Select all

{
	/* each mount point below sits on its own SAN LUN */
	node "node1"
	{
		fastname "servername"
		pools ""
		resource disk "/datasets/d1" {pools ""}        /* mount point 1 */
		resource scratchdisk "/scratch/s1" {pools ""}  /* mount point 2 */
	}
	node "node2"
	{
		fastname "servername"
		pools ""
		resource disk "/datasets/d2" {pools ""}        /* mount point 3 */
		resource scratchdisk "/scratch/s2" {pools ""}  /* mount point 4 */
	}
}

Not sure what they mean. Vincent, can you throw some light on this?
Srikanth Reddy
Integration Consultant
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

Take a step back and look at your development environment as a whole -- is the setup adequate to meet your needs and was it architected correctly?

1. How many developers do you have working concurrently (don't care about total number, but how many are working at the same time)?

2. How many jobs do you have currently? How many jobs do you anticipate will be created per month?

3. Is this happening in all of your environments or just one in particular? Your last note seems to state this is a development environment, which is why I am asking -- I size them differently because they have different needs.

4. Piggybacking on Vincent's comments: do you have large normal lookups in your jobs that require a lot of memory?

5. Do you have large sorts that require copious amounts of scratch space?

6. Do you have large datasets that require large amounts of resource disk?

The questions could go on; those are the things that I look for when estimating the approximate size of a system. You could also ask your IBM representative to bring someone in to evaluate your needs.

I'm just outside of Cincinnati - Let me know if you can't clear your issues up. I would be happy to talk by phone - or drop by at some point to discuss.
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

Think of a LUN as a device. They want your storage admins to create four different LUNs (think devices) that you will then mount on your Windows server. The idea is that with four different devices you will not have contention between scratch (node1), scratch (node2), resource (node1), and resource (node2), which will make your system faster.

I doubt it is causing your allocation issue.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I reckon the second suggestion in the error message warrants closer examination. What is your temporary directory? How much free space exists on its file system?

Change the value of TMPDIR environment variable so that it points to a directory on a file system with lots more space than that of /tmp.

If you're running server jobs, also change UVTEMP in the uvconfig file and regen the shared memory image.
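
A quick way to answer the free-space question for TMPDIR and the configured scratch disk at the same time is sketched below. It assumes the script runs in the same environment as the engine, so that it sees the same TMPDIR value; the helper name `free_space_gb` is made up for this example, and the scratch path is the one from the posted config file.

```python
import os
import shutil
import tempfile


def free_space_gb(paths):
    """Return {path: free space in GB}; paths that cannot be queried map to None."""
    report = {}
    for path in paths:
        try:
            report[path] = shutil.disk_usage(path).free / 1024**3
        except OSError:
            report[path] = None  # path is missing or not accessible
    return report


if __name__ == "__main__":
    # Fall back to Python's idea of the temp directory when TMPDIR is not set.
    tmpdir = os.environ.get("TMPDIR", tempfile.gettempdir())
    scratch = "E:/IBM/InformationServer/Server/Scratch"
    for path, free in free_space_gb([tmpdir, scratch]).items():
        status = "not found" if free is None else f"{free:.1f} GB free"
        print(f"{path}: {status}")
```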
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I'm just outside of Cincinnati too (about an hour). :o

Check TMPDIR like Ray and the README both suggested. The paths in your config file are not the only paths used by the application when creating temporary files.

Also install the patch if you can, then give an update.
Choose a job you love, and you will never have to work a day in your life. - Confucius