Hi,
I have a parallel job that runs fine when I don't sort the data, but as soon as I add a sort the job aborts with the error below:
APT_CombinedOperatorController,1: Fatal Error: Tsort merger aborting: Scratch space full
I checked the space on the scratch filesystem:
$ df -P /opt/IBM/InformationServer/Server/Scratch
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/rx/dsk/root/IBMvol 40536164 19359184 21176980 48% /opt/IBM
The Capacity column shows only 48% used, so there should be plenty of room, yet my job still aborts. Below is the default configuration file I'm using:
main_program: APT configuration file: /opt/IBM/InformationServer/Server/Configurations/default.apt
{
node "node1"
{
fastname "etldev01"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
}
node "node2"
{
fastname "etldev01"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
}
node "node3"
{
fastname "etldev01"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
}
node "node4"
{
fastname "etldev01"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
}
}
I thought the problem was disk space, but more than half the filesystem is still free. Any help is appreciated.
Thanks,
Somaraju
Tsort merger aborting: Scratch space full Error
Craig is correct - kind of
There is space there, as you can clearly see from the df output, but while your job runs the sort operation consumes whatever space is available. When the job aborts, the temporary sort files are removed, which is why it looks like you still have space afterwards.
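One way to see this for yourself is to sample free space in the scratch filesystem while the sort is running, so you can watch it drain. This is a sketch: the default path is taken from the default.apt above, and the helper name monitor_scratch is my own invention.

```shell
#!/bin/sh
# monitor_scratch DIR SAMPLES INTERVAL
# Print free space in DIR's filesystem SAMPLES times, INTERVAL seconds apart.
monitor_scratch() {
  dir="${1:-/opt/IBM/InformationServer/Server/Scratch}"   # path assumed from default.apt
  samples="${2:-5}"
  interval="${3:-5}"
  i=0
  while [ "$i" -lt "$samples" ]; do
    # df -P gives one-line POSIX output; column 4 = available blocks,
    # column 5 = capacity used, column 6 = mount point
    df -P "$dir" | awk -v t="$(date +%H:%M:%S)" \
      'NR==2 {print t, $6, "free:", $4, "blocks,", $5, "used"}'
    sleep "$interval"
    i=$((i + 1))
  done
}

# Example: 12 samples, 5 seconds apart, while the job is running:
# monitor_scratch /opt/IBM/InformationServer/Server/Scratch 12 5
```

If the free-block count collapses toward zero just before the abort, the sort really is exhausting scratch, regardless of what df shows after the job dies.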
Also, your current resource disk and scratchdisk locations are probably not the best idea. They should be on a mount that is separate from the engine install.
Another thing you might want to do to prevent an abort due to space issues would be to create a configuration like the following -
The resource disks will be used in order of entry, meaning that when Resource_Disk_1 fills up, Resource_Disk_Overflow is there to help out.
Resource_Scratch_1 and _2 are used in round-robin fashion, and since there are two of them, ideally on separate controllers, you should see improved performance, especially if the disks are RAID 0 (striped).
{
node "node1"
{
fastname "etldev01"
pools ""
resource disk "/opt/Resource_Disk_1" {pools ""}
resource disk "/opt/Resource_Disk_Overflow" {pools ""}
resource scratchdisk "/opt/Resource_Scratch_1" {pools ""}
resource scratchdisk "/opt/Resource_Scratch_2" {pools ""}
}
node "node2"
{
fastname "etldev01"
pools ""
resource disk "/opt/Resource_Disk_2" {pools ""}
resource disk "/opt/Resource_Disk_Overflow" {pools ""}
resource scratchdisk "/opt/Resource_Scratch_1" {pools ""}
resource scratchdisk "/opt/Resource_Scratch_2" {pools ""}
}
}
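To have a job pick up the new file, point $APT_CONFIG_FILE at it. The filename twonode_overflow.apt below is just an example name I chose; save the file wherever you keep your configurations.

```shell
# Example filename is an assumption; use your own name and location.
export APT_CONFIG_FILE=/opt/IBM/InformationServer/Server/Configurations/twonode_overflow.apt

# Quick sanity check before running the job:
echo "Using config: $APT_CONFIG_FILE"
```

You can also set APT_CONFIG_FILE per job as a job parameter rather than globally, which lets you test the new layout on one job before switching everything over.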
Mike Hester
mhester@petra-ps.com