job aborted for Not enough space on node frs-diis

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

xiaying
Participant
Posts: 3
Joined: Tue May 11, 2010 6:49 am


Post by xiaying »

Hi
I migrated a job from one project to another; both projects are on the same server. For convenience, I will call the job before migration JOB1 and the job after migration JOB2. JOB2 aborted with the following errors:

Tfm_Separate,1: Fatal Error: APT_Communicator::shmemInitBuffers: createMapFile (/opt/IBM/InformationServer/Server/Scratch/tmp/apTvS30412230412230356f36) failed: Not enough space on node frs-diis
The most likely reasons for this are a system-wide limit on the number of mmap'ed shared memory segments or that you have specified scratch directories or a TMPDIR that is not local to this node (e. g. is an NFS file system).


Lkp_COUNTRY.Lnk_NonCountry_Sort,3: Fatal Error: Unable to initialize communication channel on frs-diis. This is typically caused by a configuration problem. Examples of typical problems include:
1) The temporary directory, identified by $TMPDIR and/or the scratch disks in your ORCHESTRATE configuration, is located on a non-local file system (e. g. mounted over NFS).
2) The temporary directory is located on a file system with insufficient space.
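
The messages above point at creating a memory-mapped file in the scratch directory, so one way to narrow things down is to try the same kind of operation by hand. The following is only a rough probe, not what the parallel engine does internally: `probe_mmap` and the 1 MiB size are made up for illustration, and the directory is the one named in the error message. It would need to be run as the same user that runs the job.

```python
import mmap
import tempfile


def probe_mmap(directory, size_bytes=1 << 20):
    """Create a temporary file in `directory`, map it, and write one byte.

    Returns None if everything worked, otherwise the exception that was raised.
    The temporary file is deleted automatically when it is closed.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            f.truncate(size_bytes)  # extend the empty file to the requested size
            with mmap.mmap(f.fileno(), size_bytes) as m:
                m[0:1] = b"x"  # touch the mapping so at least one page is used
        return None
    except (OSError, ValueError) as exc:
        return exc


# Directory taken from the error message; adjust if your scratch disk differs.
scratch_tmp = "/opt/IBM/InformationServer/Server/Scratch/tmp"
print(scratch_tmp, "->", probe_mmap(scratch_tmp) or "mmap OK")
```

If the probe succeeds while the job still fails, the directory itself is probably usable and the limit being hit is more likely related to the job's own processes.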


df
Filesystem      512-blocks       Free  %Used   Iused  %Iused  Mounted on
/dev/fslv05       41943040    9064408    79%  128436     12%  /opt/IBM/InformationServer
/dev/fslv04       41943040   41933800     1%      25      1%  /opt/IBM/InformationServer/Server/Datasets
/dev/dsscratch   146800640  144146992     2%      34      1%  /opt/IBM/InformationServer/Server/Scratch
/dev/fslv07        6291456    6134952     3%     942      1%  /var/mqm
/dev/fslv08       16777216   13496472    20%      46      1%  /home/db2inst1/frs11/logs
/dev/fslv09       83886080   59641720    29%    1215      1%  /files
/dev/fslv06       52428800   52420048     1%      10      1%  /home/db2inst1/frs11/tempspace
/dev/fslv10       16777216   13496472    20%      46      1%  /home/db2inst1/frs12/logs
/dev/fslv11       52428800   52420048     1%      10      1%  /home/db2inst1/frs12/tempspace
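
Since df reports 512-byte blocks here, the figures are easier to judge after conversion. A small sketch of the arithmetic, using three rows copied from the df output above (the helper name `blocks_to_gib` is made up for illustration):

```python
BLOCK_BYTES = 512  # df above reports sizes in 512-byte blocks

# (mount point, total blocks, free blocks), copied from the df output above
DF_ROWS = [
    ("/opt/IBM/InformationServer", 41943040, 9064408),
    ("/opt/IBM/InformationServer/Server/Datasets", 41943040, 41933800),
    ("/opt/IBM/InformationServer/Server/Scratch", 146800640, 144146992),
]


def blocks_to_gib(blocks):
    """Convert a count of 512-byte blocks to GiB."""
    return blocks * BLOCK_BYTES / 2**30


for mount, total, free in DF_ROWS:
    print(f"{mount}: {blocks_to_gib(free):.1f} GiB free of {blocks_to_gib(total):.1f} GiB")
```

The scratch file system comes out at roughly 68.7 GiB free of 70 GiB, so at the time df was run it was nowhere near full.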


ulimit -a
core file size (blocks) 1048575
data seg size (kbytes) unlimited
file size (blocks) unlimited
max memory size (kbytes) unlimited
open files unlimited
pipe size (512 bytes) 64
stack size (kbytes) unlimited
cpu time (seconds) unlimited
max user processes 500
virtual memory (kbytes) unlimited
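
One caveat about the ulimit output: on Unix in general, limits are inherited from the parent process, so the values an interactive shell shows are not necessarily the ones the job's processes run with. A minimal sketch for reading the limits from inside a process, using Python's standard `resource` module (the label-to-constant mapping is my own choice, and constants missing on a platform are skipped):

```python
import resource

# ulimit label -> name of the matching constant in the resource module
LIMIT_NAMES = {
    "data seg size": "RLIMIT_DATA",
    "stack size": "RLIMIT_STACK",
    "open files": "RLIMIT_NOFILE",
    "max user processes": "RLIMIT_NPROC",
    "virtual memory": "RLIMIT_AS",
}


def current_limits():
    """Return {label: (soft, hard)} for the limits this platform exposes."""
    def show(value):
        return "unlimited" if value == resource.RLIM_INFINITY else value

    limits = {}
    for label, name in LIMIT_NAMES.items():
        const = getattr(resource, name, None)
        if const is None:  # not every platform defines every limit
            continue
        soft, hard = resource.getrlimit(const)
        limits[label] = (show(soft), show(hard))
    return limits


for label, (soft, hard) in current_limits().items():
    print(f"{label}: soft={soft} hard={hard}")
```

Comparing this output between the shell and the environment that launches the job would show whether the two really share the same limits.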

However, JOB1 runs normally.
Can anyone help me?
Last edited by xiaying on Tue May 11, 2010 7:43 am, edited 4 times in total.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Gack, edit your post and get rid of the blue font, that's unreadable. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

The job aborted due to lack of space. Note that once a job aborts or completes, all the scratch space it used is cleaned up.
Nag
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Right. I'm not sure what advice you are looking for here other than to add more space, run fewer jobs at the same time, or make them more efficient in their scratch space utilization, if at all possible.
-craig

xiaying
Participant
Posts: 3
Joined: Tue May 11, 2010 6:49 am

Post by xiaying »

nagarjuna wrote: The job aborted due to lack of space. Note that once a job aborts or completes, all the scratch space it used is cleaned up.

Thanks for your reply, but I have already cleared the scratch directory and it did not help; the job still aborts. I also saw a solution to another "Not enough space" problem (viewtopic.php?t=104125).
It is similar to this problem, but I think the differences outweigh the similarities.

That problem was: "KP,1: Could not map table file "/path/xx/lookuptable.20061006.drmm0ac (size 552711192 bytes)": Not enough space Error finalizing / saving table /path/xx/ds_temp/dynLUT143294a5578e98 "

My problem is: "APT_CombinedOperatorController(0),0: Fatal Error: APT_Communicator::shmemInitBuffers: createMapFile (/opt/IBM/InformationServer/Server/Scratch/tmp/apTvS340870340870c6f1f519) failed: Not enough space on node frs-diis"

Can anybody help me resolve this problem, or explain the difference between the two errors?
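
One thing the two messages have in common is that both come from mapping a file into memory, and both end in the same "Not enough space" text. As far as I know, on some Unix systems (AIX among them) that text is the system message for ENOMEM, a failed memory request, whereas a full file system is reported as ENOSPC. Treat that as an assumption to verify on the server itself; a quick way to see what the local platform prints for each code:

```python
import errno
import os


def describe(code):
    """Return 'NAME: message' for an errno value, as reported by this platform."""
    return f"{errno.errorcode[code]}: {os.strerror(code)}"


# ENOMEM: a memory request (for example mmap) could not be satisfied
print(describe(errno.ENOMEM))
# ENOSPC: the file system has no free blocks left
print(describe(errno.ENOSPC))
```

If the ENOMEM line matches the text in the job log, the df figures would not be expected to explain the failure, and memory or address-space limits would be the more promising place to look.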
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Are you on a 64-bit or 32-bit server? Also, set $APT_DISABLE_COMBINATION to True and rerun to see exactly which stage the error message is coming from.
-craig

D0n1117
Premium Member
Posts: 11
Joined: Sun Dec 19, 2010 1:49 pm
Location: VA

Post by D0n1117 »

I had the same problem, with the same error, in a job with a Lookup stage fed by 5 sequential files (4 of them as lookups). There was plenty of scratch disk, so space was not the issue, despite what the error says. All partitioning was set to Auto. I changed the partitioning to "Same" for the input and "Entire" for the lookups, and now it works. Explicitly set the partitioning on your inputs and see if that works.
Don
DataStage Developer
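
For readers unfamiliar with the two settings mentioned in the last post, here is a toy model of what they mean. It is a conceptual sketch in plain Python, not DataStage code, and it says nothing about what Auto actually chooses; all function names are made up, and the round-robin split is only there as a contrasting case.

```python
def partition_same(partitions):
    """'Same': every record stays in the partition it is already in."""
    return [list(p) for p in partitions]


def partition_entire(records, n_partitions):
    """'Entire': every partition receives a full copy of all records."""
    return [list(records) for _ in range(n_partitions)]


def partition_round_robin(records, n_partitions):
    """Contrasting case: records are dealt out one at a time across partitions."""
    return [records[i::n_partitions] for i in range(n_partitions)]


def lookup(stream_parts, reference_parts, key):
    """Per partition, match each stream row against that partition's reference rows."""
    results = []
    for stream, reference in zip(stream_parts, reference_parts):
        table = {row[key]: row for row in reference}
        results.extend((row, table.get(row[key])) for row in stream)
    return results


stream = [[{"id": 1}, {"id": 2}], [{"id": 3}]]  # input already split in two
reference = [{"id": 1, "c": "US"}, {"id": 2, "c": "FR"}, {"id": 3, "c": "DE"}]

with_entire = lookup(partition_same(stream), partition_entire(reference, 2), "id")
with_split = lookup(partition_same(stream), partition_round_robin(reference, 2), "id")
print("unmatched with Entire:", sum(m is None for _, m in with_entire))
print("unmatched with split reference:", sum(m is None for _, m in with_split))
```

With "Entire" every partition holds the whole reference set, so each input row can find its match wherever it sits; the price is that the reference data is held once per partition.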