Failed DB2 load - Failed opening fifo
Posted: Tue Aug 23, 2011 10:08 am
We have a job that reads from Teradata and writes directly to a remote DB2 database. If it works, it runs in 5-10 minutes. However, it often fails with the following:
101885 FATAL Thu Jun 16 07:49:14 2011 DB2_Table_Write,9: SQLCODE = -911; SQLSTATE=40001 [db2load_driver.C:969]
101886 FATAL Thu Jun 16 08:09:06 2011 DB2_Table_Write,9: SQL0911N The current transaction has been rolled back because of a deadlock or timeout. Reason code "68". SQLSTATE=40001
101886 FATAL Thu Jun 16 08:09:06 2011 [db2load_driver.C:969]
101887 FATAL Thu Jun 16 08:09:06 2011 DB2_Table_Write,8: Failed opening fifo /u001/n2/2/ordb211964623c75942a_.009 after 1,200 seconds: No such device or address.
101887 FATAL Thu Jun 16 08:09:06 2011 Please review DB2 logs. You may consider increasing the time-out time by setting the environmental variable APT_DB2LOADER_TIMEOUT. [db2loader.C:999]
Our DB2 DBAs have confirmed that there were no deadlocks on the target and that the job timed out. Also, the target system is pretty quiet, so this isn't necessarily due to system overload.
When it does fail, it will run for 30-40 minutes before failing.
We increased APT_DB2LOADER_TIMEOUT from 600 seconds (10 minutes) to 1200 seconds (20 minutes), and that helped somewhat: the job now fails less often.
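As a rough check on how that timeout lines up with the log, here is a minimal Python sketch that measures the gap between the first -911 entry and the fifo failure. The two timestamps are copied by hand from the FATAL entries in the log, and the helper name seconds_between is purely illustrative, not anything from DataStage or DB2.

```python
from datetime import datetime

# Timestamps copied by hand from the FATAL entries in the job log.
FIRST_DB2_ERROR = "Thu Jun 16 07:49:14 2011"  # SQLCODE = -911 first reported
FIFO_FAILURE = "Thu Jun 16 08:09:06 2011"     # "Failed opening fifo ..." reported

# Layout of the timestamp portion of each log line.
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def seconds_between(start: str, end: str) -> int:
    """Return whole seconds elapsed between two log timestamps (illustrative helper)."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    ended = datetime.strptime(end, LOG_TIME_FORMAT)
    return int((ended - started).total_seconds())


elapsed = seconds_between(FIRST_DB2_ERROR, FIFO_FAILURE)
timeout = 1200  # the loader timeout currently in effect, in seconds

print(f"elapsed={elapsed}s timeout={timeout}s difference={timeout - elapsed}s")
```

The gap comes out a few seconds short of the 1,200-second timeout. That would be consistent with the fifo error being the loader giving up after the earlier -911, rather than a separate problem, though that is only a reading of the timestamps.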
I have been looking through other topics with this same error, but have had no luck so far. The job runs fine 8 out of 10 times, and the other 2 times it eventually succeeds after one or more reruns.
I don't believe this is an issue of permissions or invalid directories; otherwise the job would never run successfully.
Any suggestions?
Brad.