run time fatal error : Player 12 terminated unexpectedly

kumar_s · Post by **kumar_s** » Thu Jul 13, 2006 11:40 pm

mctny wrote: I doubt this problem is really related to APT_MONITOR_TIME or APT_MONITOR_SIZE. I don't understand why these would cause job failures. Not many users are connected to the UNIX box when the runs happen. It could be a DataStage bug, a UNIX issue, or something unrelated to DS.

Yesterday I set the MONITOR_TIME parameter to nothing and ALL the jobs failed. The error was again the same for most of them, i.e., Player terminated unexpectedly. One error was different, a SIGSEGV error; I will post a new topic for that.

Have you tried DISABLE_JOBMON = True?

mctny · Post by **mctny** » Fri Jul 14, 2006 7:56 am

Hi Kumar,

IBM suggested the same thing as you. No, I haven't made that change in our production environment, because the jobs have been running fine recently: since I cleaned up the resources, I am no longer getting the player termination error.

What are the consequences of doing that? Do you have any idea?

Thanks
cetin

kumar_s · Post by **kumar_s** » Fri Jul 14, 2006 8:09 pm

You cannot just go ahead and clean up all the resources displayed in the DS Director. If you are sure that nothing is in use and no job is running, it is better to restart the server once.
There is a bug in the (Java code) Job monitor. As you may be aware, changing from the time-based monitor to the size-based monitor can mitigate the problem. As a last resort, you can turn off the monitor.
If you do this, you won't be able to see the color change in each job, nor the rows/sec statistics.
You can also handle this problem by planning the load on the server and distributing it evenly.

mctny · Post by **mctny** » Fri Jul 14, 2006 10:33 pm

kumar_s wrote: You cannot just go ahead and clean up all the resources displayed in the DS Director. If you are sure that nothing is in use and no job is running, it is better to restart the server once.
There is a bug in the (Java code) Job monitor. As you may be aware, changing from the time-based monitor to the size-based monitor can mitigate the problem. As a last resort, you can turn off the monitor.
If you do this, you won't be able to see the color change in each job, nor the rows/sec statistics.
You can also handle this problem by planning the load on the server and distributing it evenly.

I knew that no jobs were running; in fact, I am the only person who deals with DataStage or ETL stuff.

Not being able to see the job statistics is also not good for us; we need those sometimes. IBM did not mention any patch although I asked them. Maybe the person dealing with us was not aware that there is a patch for this issue. Right now everyone is happy since no jobs have failed recently. If they fail due to player termination again, I will apply your suggestion.

Thanks
cetin

kumar_s · Post by **kumar_s** » Fri Jul 14, 2006 11:36 pm

Also note that any job that uses the LinkInfo function will eventually fail as well, because LinkInfo gets its information from the monitor.
You can ask for the ecase directly. Someone who has dealt with this can provide it; otherwise I'll search for it and post it.

mctny · Post by **mctny** » Sat Jul 15, 2006 7:02 am

kumar_s wrote: Also note that any job that uses the LinkInfo function will eventually fail as well, because LinkInfo gets its information from the monitor.
You can ask for the ecase directly. Someone who has dealt with this can provide it; otherwise I'll search for it and post it.

Is that the name of the patch? I mean "ecase".

kumar_s · Post by **kumar_s** » Sun Jul 16, 2006 2:08 am

No, an ecase is something like a serial number given to each case logged against the product. So there should be some number, say ecasennnn, that someone logged for this player termination problem.
I started searching but could not find it on my new PC. It should be in my backup.

mctny · Post by **mctny** » Tue Jul 18, 2006 9:04 pm

kumar_s wrote: No, an ecase is something like a serial number given to each case logged against the product. So there should be some number, say ecasennnn, that someone logged for this player termination problem.
I started searching but could not find it on my new PC. It should be in my backup.

Yes, IBM assigned us an ecase number and our case is still pending. Since we did not apply their suggestion (setting Disable_JobMon to True), the ball is in our court right now. If we apply their suggestion, some of our jobs will fail because they use the LinkInfo function. And so far our jobs have run perfectly since I cleaned up the resources.

So my dilemma is: should I apply the suggested change, which will cause our jobs to fail, just to learn from IBM how to solve the player termination issue, or should I wait and see, which might lead IBM to close the case?
Isn't it a weird situation?

pigpen · Post by **pigpen** » Tue Jul 18, 2006 11:49 pm

Did you try the following?

1. enlarge the scratchdisk size
2. enlarge the TmpDir size
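
Before enlarging either one, it may help to see how much free space they actually have around the time the jobs run. Below is a minimal Python sketch, not anything shipped with DataStage: the function names and the 5 GiB threshold are made up for illustration, and the directory list is a placeholder that you would replace with the scratchdisk paths from your own configuration file and your TmpDir.

```python
import shutil
import tempfile


def free_space_gib(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def check_dirs(paths, min_free_gib):
    """Map each path to (free_gib, ok), where ok means free_gib >= min_free_gib."""
    report = {}
    for path in paths:
        free = free_space_gib(path)
        report[path] = (free, free >= min_free_gib)
    return report


if __name__ == "__main__":
    # Placeholder only: substitute your scratchdisk directories and TmpDir here.
    dirs_to_check = [tempfile.gettempdir()]
    # Hypothetical threshold; pick one that matches your data volumes.
    for path, (free, ok) in check_dirs(dirs_to_check, min_free_gib=5.0).items():
        status = "ok" if ok else "LOW"
        print(f"{path}: {free:.1f} GiB free ({status})")
```

Running something like this from cron while the jobs are active would show whether either directory gets close to full at the time a player terminates.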