Run Time errors

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kiran
Participant
Posts: 3
Joined: Mon Oct 13, 2003 8:27 am

Run Time errors

Post by kiran »

I am having consistent job failures with the following FATAL errors. Could someone help me out with a solution? This is happening in production.

FATAL :APT_CombinedOperatorController(4),4: Failure during execution of operator logic
FATAL : APT_CombinedOperatorController(4),4: mmap failed: Resource temporarily unavailable
FATAL : APT_CombinedOperatorController(6),1: Fork failed: Not enough space.

I changed the config file to use the /tmp space in Unix, which is around 50GB. Monitoring the process showed that it uses up to 22-25GB at peak. Also, the &PH& directory is not full. If I reset the jobs and rerun them, they finish successfully.
Has anyone had the same problem?
Thanks
Kiran.
bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Post by bigpoppa »

Sometimes swap space is allocated out of /tmp, and swapping and /tmp files can compete for the same disk resources. Fork might be failing because it's running out of swap space in /tmp.

Nothing in your config file should ever point to /tmp. Point your scratch/datasets to a different directory, and try running your job again.
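By way of illustration only (the node name and paths below are hypothetical, not taken from Kiran's site), a parallel configuration file with scratch and dataset storage on a dedicated filesystem instead of /tmp might look like:

```
{
  node "node1" {
    fastname "pxhost"
    pools ""
    resource disk "/dsdata/datasets" {pools ""}
    resource scratchdisk "/dsdata/scratch" {pools ""}
  }
}
```

The same applies to every node entry: any scratchdisk line that points under /tmp puts job scratch I/O in direct contention with swap.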

Let us know if that helps.

- BP
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

This is particularly true on Solaris, where swap is mounted on /tmp by default, and /tmp is pretty small by default. Usually some tweaking of the latter is necessary in any case, and using other disk space for application-based temp is highly desirable.
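A quick way to confirm whether /tmp is swap-backed (the df command is common to most Unix systems; the swap commands are Solaris-specific, so they are shown only as comments to run on the Solaris box itself):

```shell
# Show which filesystem backs /tmp; on a default Solaris install
# the "Filesystem" column for /tmp reads "swap" (tmpfs).
df -k /tmp

# Solaris-only checks, to be run on the Solaris box:
#   swap -l    # list swap devices and how much of each is free
#   swap -s    # summary of allocated/reserved/available swap
```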
kiran
Participant
Posts: 3
Joined: Mon Oct 13, 2003 8:27 am

Post by kiran »

Thanks for your replies.
I originally had the scratch space pointed to the scratchdisk on the DataStage mount point. That did not work, so I changed it to /tmp. It has 50GB of space, and while I was monitoring, the stats showed that approximately 22GB was used by Solaris, leaving the other ~28GB for swap. This did not help either. I am looking for an alternative solution.
Thanks.
bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Post by bigpoppa »

Kiran,

I just want to make sure I understand the problem completely.

1. Is the OS swap space pointing to /tmp?
2. How are you monitoring /tmp and swap usage?
3. Is the box quiet except for PX?
4. Can you do a 'ulimit -a' on your Solaris box and give us the results?
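For item 4, the check is just the following (output varies by shell and OS; the limits worth noting for fork/mmap trouble are data, stack, virtual memory, and max user processes):

```shell
# Print the resource limits of the current shell, which child
# processes (such as PX player processes) will inherit.
ulimit -a
```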

I like to use the top utility to monitor PX jobs on UNIX. If you have 'top' on your machine, watch for a line like this while your job is running:

Memory: 32G real, 12G free, 20G swap in use, 49G swap free

If the 'swap free' stat starts to dwindle, you might end up with that fork/mmap problem.

-BP
kiran
Participant
Posts: 3
Joined: Mon Oct 13, 2003 8:27 am

Post by kiran »

bigpoppa wrote:Kiran,

I just want to make sure I understand the problem completely.

1. Is the OS swap space pointing to /tmp?
2. How are you monitoring /tmp and swap usage?
3. Is the box quiet except for PX?
4. Can you do a 'ulimit -a' on your Solaris box and give us the results?

I like to use the top utility to monitor PX jobs on UNIX. If you have 'top' on your machine, watch for a line like this while your job is running:

Memory: 32G real, 12G free, 20G swap in use, 49G swap free

If the 'swap free' stat starts to dwindle, you might end up with that fork/mmap problem.

-BP
BP,
The answers to your questions are
1. Yes
2. Running the top utility (not me; the admin runs it).
3. There are two projects on the box for PX. Other than that, it is quiet.
4. I will try to do that and give you the results.

To your point of 'swap free' stat dwindling, yes it does when the jobs are running. I do not remember the exact numbers.
Hope this helps.
Thanks for your help.