Parallel job getting aborted due to APT_BufferOperator

Posted: Mon Dec 14, 2009 7:47 am
by ketanshah123
Hi All

One of the parallel job gettign qbporetd with the following error :

buffer(5),3: APT_BufferOperator: Add block to queue failed. This means that your buffer filesystems all ran out of file space, or that some other system error occurred. Please ensure that you have sufficient scratchdisks in either the default or "buffer" pools on all nodes in your configuration file. [iomgr/bufferop.C:1397]


How can I resolve this error? Any suggestions?

Thanks in advance.

Posted: Mon Dec 14, 2009 8:07 am
by chulett
"gettign qbporetd"? :?

First suggestion?

"Please ensure that you have sufficient scratchdisks in either the default or "buffer" pools on all nodes in your configuration file"

Posted: Mon Dec 14, 2009 8:11 am
by ketanshah123
{
    node "node1"
    {
        fastname "natsci163"
        pools ""
        resource disk "/data01/Datasets" {pools ""}
        resource scratchdisk "/data01/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "natsci163"
        pools ""
        resource disk "/data01/Datasets" {pools ""}
        resource scratchdisk "/data01/Scratch" {pools ""}
    }
    node "node3"
    {
        fastname "natsci163"
        pools ""
        resource disk "/data01/Datasets" {pools ""}
        resource scratchdisk "/data01/Scratch" {pools ""}
    }
    node "node4"
    {
        fastname "natsci163"
        pools ""
        resource disk "/data01/Datasets" {pools ""}
        resource scratchdisk "/data01/Scratch" {pools ""}
    }
}


This is my configuration file. I have checked, and /data01/Scratch and /data01/Datasets are both only 50% used:
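One detail worth noting in the configuration above is that all four nodes point at the same scratchdisk path, so any buffer overflow from every node lands on one filesystem. The following is a minimal sketch for listing which nodes share which scratchdisk. The function name `scratchdisks_by_path` and the line-by-line regex parsing are illustrative assumptions, not part of any DataStage tooling, and the parser only handles the simple one-resource-per-line layout shown in this thread.

```python
import re
from collections import defaultdict

# Configuration text copied from the post above.
CONFIG = '''
{
node "node1"
{
fastname "natsci163"
pools ""
resource disk "/data01/Datasets" {pools ""}
resource scratchdisk "/data01/Scratch" {pools ""}
}
node "node2"
{
fastname "natsci163"
pools ""
resource disk "/data01/Datasets" {pools ""}
resource scratchdisk "/data01/Scratch" {pools ""}
}
node "node3"
{
fastname "natsci163"
pools ""
resource disk "/data01/Datasets" {pools ""}
resource scratchdisk "/data01/Scratch" {pools ""}
}
node "node4"
{
fastname "natsci163"
pools ""
resource disk "/data01/Datasets" {pools ""}
resource scratchdisk "/data01/Scratch" {pools ""}
}
}
'''

NODE_RE = re.compile(r'^\s*node\s+"([^"]+)"')
SCRATCH_RE = re.compile(
    r'^\s*resource\s+scratchdisk\s+"([^"]+)"\s*\{\s*pools\s+(.*?)\s*\}'
)


def scratchdisks_by_path(config_text):
    """Map each scratchdisk path to the (node, pools) entries that use it."""
    usage = defaultdict(list)
    current_node = None
    for line in config_text.splitlines():
        node_match = NODE_RE.match(line)
        if node_match:
            current_node = node_match.group(1)
            continue
        scratch_match = SCRATCH_RE.match(line)
        if scratch_match:
            # Pool names are the quoted strings after "pools"; "" is the default pool.
            pools = re.findall(r'"([^"]*)"', scratch_match.group(2))
            usage[scratch_match.group(1)].append((current_node, pools))
    return dict(usage)


for path, users in scratchdisks_by_path(CONFIG).items():
    nodes = ", ".join(node for node, _ in users)
    print(f"{path}: shared by {len(users)} node(s): {nodes}")
```

For this configuration the sketch reports a single scratchdisk path shared by node1 through node4, all in the default pool.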

/data02/target $ cd /data01/Datasets
/data01/Datasets $ df .
Filesystem       512-blocks      Free %Used  Iused %Iused Mounted on
/dev/data01lv01    34078720  17176064   50%  50284     3% /data01
/data01/Datasets $ cd /data01/Scratch
/data01/Scratch $ df .
Filesystem       512-blocks      Free %Used  Iused %Iused Mounted on
/dev/data01lv01    34078720  17176064   50%  50284     3% /data01
/data01/Scratch $
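To put the df figures in more familiar units, here is a small sketch that converts the 512-byte block counts to GiB. The only assumption is the block size, which is taken from the "512-blocks" column header; the helper name `blocks_to_gib` is made up for illustration.

```python
BLOCK_SIZE = 512  # bytes per block, per the "512-blocks" column header


def blocks_to_gib(blocks, block_size=BLOCK_SIZE):
    """Convert a count of fixed-size blocks to GiB."""
    return blocks * block_size / 2**30


# Figures copied from the df output for /dev/data01lv01.
total_blocks = 34078720
free_blocks = 17176064

total_gib = blocks_to_gib(total_blocks)
free_gib = blocks_to_gib(free_blocks)
used_pct = 100 * (total_blocks - free_blocks) / total_blocks

print(f"total {total_gib:.2f} GiB, free {free_gib:.2f} GiB, used {used_pct:.1f}%")
# prints: total 16.25 GiB, free 8.19 GiB, used 49.6%
```

Both directories sit on the same /data01 filesystem, so roughly 8 GiB of free space is shared between datasets and scratch for all four nodes, and this snapshot was taken while the job was not running.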

Posted: Mon Dec 14, 2009 8:21 am
by chulett
You need to monitor the space while the job is running.
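One way to do that is to sample the filesystem at a short interval while the job runs and record the peak. This is only a sketch using Python's standard `shutil.disk_usage`; the function names, the interval, and the sample count are arbitrary choices for illustration, and the percentage it reports may differ slightly from df's %Used column.

```python
import shutil
import time


def used_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def watch_peak(path, interval_s=5.0, samples=12):
    """Sample usage `samples` times, `interval_s` seconds apart; return the peak."""
    peak = 0.0
    for i in range(samples):
        pct = used_percent(path)
        peak = max(peak, pct)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} {path} {pct:.1f}% used (peak so far {peak:.1f}%)")
        if i < samples - 1:
            time.sleep(interval_s)
    return peak


# Example call, started just before the job is launched:
#   watch_peak("/data01/Scratch", interval_s=5, samples=120)
```

If the peak approaches 100% shortly before the job aborts, the scratch filesystem is filling up during the run even though it looks half empty afterwards.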

Posted: Mon Dec 14, 2009 9:01 am
by ketanshah123
Which space do I need to monitor? The location where the file is being created,
or

resource disk "/data01/Datasets" {pools ""}

or

resource scratchdisk "/data01/Scratch" {pools ""}

Posted: Mon Dec 14, 2009 9:02 am
by chulett
Which was mentioned in your error message?

Posted: Mon Dec 14, 2009 9:04 am
by ketanshah123
OK, got it. I need to monitor the scratchdisks in either the default or "buffer" pools.