Transform processes randomly hanging.
Posted: Thu Jul 27, 2000 5:23 am
I have an intermittent problem.
Some time back (pre-Informix days), I spent a fair bit of time looking into this with the assistance of Ardent. We never did resolve it, but it became much less frequent, so I have not pursued it since. (More on this later.)
The symptoms are as follows:
Within a job, one or more active stages will have a status of "Starting". It does not matter how long you leave the job, nothing changes. These jobs will abort if I stop them from Director. There is no consistency as to which jobs will do this, or which stage(s) of which jobs. I have personally only seen this problem on an active stage whose primary input is from ODBC. Another developer on our team claims to have had this problem where the primary input was ORAOCI8, but I cannot positively confirm this. I can say that it has happened with different ODBC drivers: I have seen it with both the Microsoft Visual FoxPro and the Microsoft SQL Server ODBC drivers.
The problem is not predictably repeatable, although it seems to occur only when our server is very busy.
What we have figured out:
It would appear that the process for the job attempts to start the processes for each active stage but some of these processes either do not start or die suddenly as soon as they have started. These processes manage to create a work file in &PH& but do not write anything to it.
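For anyone who wants to check their own server for the same symptom, here is a minimal sketch in Python. It is not anything Ardent supplied; it simply assumes that &PH& can be read as an ordinary directory on the server's file system, and it lists files there that are still zero bytes long after a chosen number of seconds. The function name, the 600-second default and the `now` parameter are all my own illustrative choices.

```python
import os
import time


def find_empty_work_files(ph_dir, min_age_seconds=600, now=None):
    """Return (path, age_in_seconds) for zero-byte files older than min_age_seconds.

    ph_dir is assumed to be the path of the project's &PH& directory.
    now defaults to the current time; it is a parameter only to ease testing.
    """
    now = time.time() if now is None else now
    hits = []
    with os.scandir(ph_dir) as entries:
        for entry in entries:
            # Skip sub-directories and anything that is not a regular file.
            if not entry.is_file(follow_symlinks=False):
                continue
            info = entry.stat(follow_symlinks=False)
            age = now - info.st_mtime
            # An old file with nothing in it matches the symptom described above.
            if info.st_size == 0 and age >= min_age_seconds:
                hits.append((entry.path, age))
    # Oldest candidates first.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling `find_empty_work_files(path_to_ph)` while jobs are running gives a list of candidate files. An empty file is only a hint, not proof of a hang, since a healthy process may also be slow to write on a busy server.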
I originally had the problem when reading from some dBase IV format files using the MS Visual FoxPro version 5 ODBC driver (note that the usual dBase driver does not handle long file names). Based on suspicions (now why would I think that about MS software?) that it was to do with ODBC, I upgraded to version 6 of the Visual FoxPro driver. This dramatically reduced the problem, although it did not go away completely. It has only been happening, say, once per month (which isn't bad, as we run about 300 jobs every night).
Unfortunately the problem has been happening a bit more often lately.
* News Flash * Another job has just been discovered hung as I type this. This job is reading from ORAOCI stages (not ORAOCI8). So much for my theory about it only being ODBC.
I'm not really expecting anyone to solve this problem, and as Ardent and I have been over this in detail, I'd be surprised (but thankful) if anyone came up with something new that we haven't already looked at.
What I really want to know is HAS THIS HAPPENED TO ANYONE ELSE??? Or am I alone with this problem?
David Barham
Information Technology Consultant
CoalMIS Project
Shell Coal Pty Ltd
Brisbane, Australia
*************************************************************************
This e-mail and any files transmitted with it may be confidential and are intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in
error, please notify the sender by return e-mail, and delete this e-mail from your in-box. Do not copy it to anybody else
*************************************************************************