Random abnormal termination of jobs
Hi,
We have just recently upgraded to DS 7.5 on a Solaris 2.8 SUN server.
We are having problems with jobs (no job in particular) aborting with "Abnormal termination of stage P331LoadAssetSourceNEMS00..Transform detected". This can happen to just about any job, but one observation is that it tends to happen when a large number of jobs are run concurrently. The same jobs run successfully most of the time. They are very simple ETL jobs that read from a source and write out sequential files.
Has anyone else had the same problem ?
Let me know if you need more information.
Nick
Re: Random abnormal termination of jobs
We have also faced this issue and have an open ticket with Ascential on it. As a workaround on our end, we have introduced a 30-second delay in the shell script that calls the DataStage job.
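One possible reading of the delay workaround above is a wrapper that pauses between job launches so that fewer jobs start at the same instant. The sketch below is only an illustration: the project and job names are placeholders, the 30-second value comes from the post, and `dsjob -run` is assumed to be how the jobs are started (any extra options your site needs would go into `LAUNCH_CMD`).

```shell
#!/bin/sh
# Sketch: start several jobs with a pause between launches.
# LAUNCH_DELAY and LAUNCH_CMD are illustrative names, not DataStage settings.
LAUNCH_DELAY=${LAUNCH_DELAY:-30}        # seconds between launches
LAUNCH_CMD=${LAUNCH_CMD:-"dsjob -run"}  # assumed command used to start one job

launch_staggered() {
    project=$1
    shift
    first=1
    for job in "$@"; do
        # Pause before every launch except the first one.
        if [ "$first" -eq 0 ]; then
            sleep "$LAUNCH_DELAY"
        fi
        first=0
        # Start the job in the background so the jobs can still overlap.
        $LAUNCH_CMD "$project" "$job" &
    done
    # Wait for all launch commands to return.
    wait
}

# Example call (hypothetical project and job names):
# launch_staggered MyProject JobA JobB JobC
```

The jobs still run concurrently; only their start times are spread out, which may be enough if the failures are tied to many jobs initializing at once.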
Make sure your T30FILE setting is high enough to support the number of jobs executing simultaneously. Your problem is a common one. Search the forum for discussions about the uvconfig file and recommended settings. The abnormal terminations can be caused by too few internal pointers being available to address all of the open hash files (each job has log, status, config, and other dynamic hash files open).
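To check the current setting, something like the following could be used. It is a minimal sketch that assumes uvconfig holds one `NAME value` pair per line (for example `T30FILE 2048`) and that `DSHOME` points at the engine directory containing the file; `uvconfig_value` is a helper name made up for this example.

```shell
#!/bin/sh
# Print the value of one tunable from a uvconfig-style file.
# Assumes lines of the form "NAME value"; comment lines are skipped
# because their first field never equals the tunable name.
uvconfig_value() {
    name=$1
    file=$2
    awk -v n="$name" '$1 == n { print $2; exit }' "$file"
}

# Example (DSHOME is assumed to be set by sourcing dsenv):
# uvconfig_value T30FILE "$DSHOME/uvconfig"
```

Changing the value is a separate step. An edit to uvconfig normally takes effect only after the engine configuration is regenerated and the server restarted, so confirm the procedure with support before touching a production machine.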
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Hi
T30FILE is set to 2048 on this server but the problem is still occurring. What I have seen is that it seems to happen less often if the jobs are run sequentially from the job control as opposed to in parallel.
But that is not good for us because then some of the batches will simply take too long to complete.
Thanks for your suggestion though.
Nick
Release 5.x running on Solaris 2.8 had issues, characterized by random abnormal terminations under heavy system load, that required both a Sun patch and a DS patch. Release 6+ incorporated the DS-side fixes, but I believe the Sun patch is still required. Consider contacting tech support to verify that the patch set and kernel parameters on your machine are what they need to be.
Kenneth Bland
Take a leap of faith and change your dsenv in the following way:
Add "/usr/lib/lwp" in front of the LD_LIBRARY_PATH, restart DS and your problem will magically disappear ;)
So it should be something like
LD_LIBRARY_PATH=/usr/lib/lwp:...
This is a workaround for a known thread problem in Solaris/DataStage.
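In dsenv the change could look roughly like this. It is a sketch, not the vendor's documented fix: `prepend_path` is a helper invented for the example, and the rest of LD_LIBRARY_PATH is whatever your dsenv already sets.

```shell
# Put a directory at the front of a colon-separated path list,
# leaving the list alone if the directory is already first.
prepend_path() {
    dir=$1
    current=$2
    case "$current" in
        "")              printf '%s\n' "$dir" ;;
        "$dir"|"$dir":*) printf '%s\n' "$current" ;;
        *)               printf '%s\n' "$dir:$current" ;;
    esac
}

# /usr/lib/lwp is the directory named in the post above.
LD_LIBRARY_PATH=$(prepend_path /usr/lib/lwp "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```

Placing these lines after the point where dsenv builds its own LD_LIBRARY_PATH keeps the existing entries intact; the DataStage server has to be restarted for the change to apply.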
Ogmios
In theory there's no difference between theory and practice. In practice there is.
What about DS6 on AIX 5?
Any idea if there is a similar fix for AIX?
Double that on the Windows version. We've been experiencing random terminations for the last 8 months. Last night 3 jobs failed with this error or similar:
jb000PlyrSessionGroupALL6.9.Copy_of_Link_Partitioner_62.ww: ds_ipcopen() - call to OpenFileMapping() failed - The system cannot find the file specified.