Write to dataset on [fd 19] failed: Error 0

Posted: Fri Jan 19, 2007 4:09 am
by vij
Hi all,

I got the fatal error below. My understanding is that once a Data Set grows beyond 2GB, no more data can be written to it and the job fails. Is my understanding correct?

APT_CombinedOperatorController,1: Write to dataset on [fd 19] failed: Error 0
The error occurred on Orchestrate node node2 (hostname XXXXXXXX)
I also searched the forum and learned that I should include the following details as additional information:
ulimit -a
core file size (blocks) unlimited
data seg size (kbytes) unlimited
file size (blocks) unlimited
open files 512
pipe size (512 bytes) 10
stack size (kbytes) 8192
cpu time (seconds) unlimited
max user processes 29995
virtual memory (kbytes) unlimited
If I use a sequential file instead of a Data Set, will that solve the problem?

Or what is the solution to this problem?

Thanks in advance!!!

Posted: Fri Jan 19, 2007 4:17 am
by ArndW
If your OS supports files larger than 2GB and your process limits do not cap files at 2GB, then you will not hit this problem. What settings do you actually have at present for the "file size"?
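
As a rough illustration of the "file size" check above, here is a minimal Python sketch that reads the per-process file-size limit and reports whether it would stop a file before 2GB. The helper names (`caps_below_2gb`, `current_file_size_limit`) are made up for this example. It only tells you something useful when run as the same user, and in the same environment, that actually starts the jobs, since limits seen in an interactive shell can differ. Note that `ulimit -a` reports this limit in blocks, whereas `resource.getrlimit` returns bytes.

```python
import resource

TWO_GB = 2 * 1024**3  # 2 GB expressed in bytes


def caps_below_2gb(soft_limit_bytes):
    """Return True if this soft file-size limit would stop a file before 2 GB."""
    # RLIM_INFINITY is a sentinel (its numeric value is platform dependent),
    # so it must be compared for equality before any size comparison.
    if soft_limit_bytes == resource.RLIM_INFINITY:
        return False
    return soft_limit_bytes < TWO_GB


def current_file_size_limit():
    """Return (soft, hard) RLIMIT_FSIZE for the current process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_FSIZE)


if __name__ == "__main__":
    soft, hard = current_file_size_limit()
    if soft == resource.RLIM_INFINITY:
        print("file size: unlimited")
    else:
        print("file size soft limit: %d bytes" % soft)
    print("caps files below 2 GB:", caps_below_2gb(soft))
```

If this reports that the limit does not cap files below 2GB, the process limit is probably not the cause, and the file system or the configuration is the next place to look.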

Using a sequential file instead of a Data Set while keeping the 2GB limit will actually let you store less data, since each physical file in a Data Set can reach 2GB.

Posted: Fri Jan 19, 2007 5:33 am
by ray.wurlod
Further, a Data Set can have multiple physical files (segments) per node. So 2GB is not an issue.

To perform more detailed diagnosis you will need to disable operator combination, either at the job level (via APT_DISABLE_COMBINATION environment variable) or at the stage level (via Advanced tab on stage properties). Only in this way can you reliably ascertain the stage in which the problem occurred.

Alas, Error 0 is not very informative - it's supposed to mean "success".

Check that you (the user ID under which jobs run) have write permission to every directory mentioned in the resources in your configuration file.
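
The permission check above can be roughly automated. The Python sketch below pulls the directory paths out of the configuration file and lists those that the current user cannot write to. It assumes the resource entries look like `resource disk "/path" {pools ""}` or `resource scratchdisk "/path" {pools ""}`, and that the file is the one named by `APT_CONFIG_FILE`. The function names are invented for this example. Run it as the user ID under which the jobs run, and on each node if the directories are not shared.

```python
import os
import re

# Assumed shape of a resource line in the configuration file:
#   resource disk "/some/path" {pools ""}
#   resource scratchdisk "/some/path" {pools ""}
RESOURCE_RE = re.compile(r'resource\s+(?:disk|scratchdisk)\s+"([^"]+)"')


def resource_dirs(config_text):
    """Return the quoted paths on resource lines, in order, without duplicates."""
    seen = []
    for path in RESOURCE_RE.findall(config_text):
        if path not in seen:
            seen.append(path)
    return seen


def unwritable_dirs(paths):
    """Return the paths that are missing, not directories, or not writable."""
    return [
        p for p in paths
        if not (os.path.isdir(p) and os.access(p, os.W_OK | os.X_OK))
    ]


if __name__ == "__main__":
    config_path = os.environ.get("APT_CONFIG_FILE")
    if config_path is None:
        print("APT_CONFIG_FILE is not set")
    else:
        with open(config_path) as fh:
            dirs = resource_dirs(fh.read())
        for bad in unwritable_dirs(dirs):
            print("not writable:", bad)
```

`os.access` checks against the real user ID of the calling process, which is why the script has to be run as the job user rather than as an administrator.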

Posted: Fri Jan 19, 2007 7:37 am
by chulett
'Error 0' is kind of a catch-all unknown error on some systems, from what I recall. Still, as you said, not very informative. ;)