Rare Error Message
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 1255
- Joined: Wed Feb 02, 2005 11:54 am
- Location: United States of America
Rare Error Message
Hi All,
When one of my jobs runs, it aborts with a 'Floating Point Exception' message. The log also says before it that it is a Phantom.
Has anyone encountered this type of error message before?
Any tiny bit of insight or help is very much appreciated.
Thanks much,
Naveen.
Re: Rare Error Message
Naveen,
We got a similar error message when our server was too busy to trigger a job. The problem went away after we reduced the load on the server.
Hope this helps
Yamini
We are also facing the same error, when looking up from a hash file in a transformer stage. There are no divisions by numbers, all the data types are set accordingly, and the only constraint in the transformer is Not(IsNull()).
It occurs only with hash files: if pre-load to memory is disabled, it works fine; if pre-load to memory is enabled, it fails.
The strange thing is that the jobs failing now had been running successfully for a year.
From what I can see, the hash files are not created correctly. They are not converted to DATA and OVER files; instead a separate file is created per record, where the file names are the keys and the contents are the other fields. I guess the hash file creation somehow fails in DataStage without any indication.
There is no useful message in the log, nor in the &PH& directory.
Only these two warnings can be seen in the log:

```
Attempting to Cleanup after ABORT raised in stage Jobname..Hashname
```

```
Message: DataStage Job 973 Phantom 26173
Floating point exception
Attempting to Cleanup after ABORT raised in stage
DataStage Phantom Aborting with @ABORT.CODE = 3
```

This is from &PH&:

```
DataStage Job 974 Phantom 26150
Job Aborted after Fatal Error logged.
```
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Clearly it's not rare.
Floating point exceptions ought not to occur with hashed files, though it is possible with corrupted hashed files (where the modulus calculation performs an improper division). Have you checked the structural integrity of the hashed file(s) that your job accesses?
Otherwise, where are you performing any floating point arithmetic?
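As a first pass before reaching for the engine's own tools, the on-disk layout of a dynamic (Type 30) hashed file can be sanity-checked from the shell. This is a hedged sketch only: it assumes the usual Type 30 layout (a directory containing `DATA.30`, `OVER.30`, and a hidden `.Type30` marker), and it only inspects the file system; it cannot prove internal integrity.

```shell
#!/bin/sh
# Quick layout check for a dynamic (Type 30) hashed file.
# Assumption: a healthy Type 30 file is a directory holding DATA.30 and
# OVER.30 (plus a hidden .Type30 marker). A reverted/corrupted one shows
# one file per record instead. This checks layout only, not internal
# structure -- a real integrity check needs the engine's tools from TCL.
check_hashed_file() {
    d="$1"
    if [ -d "$d" ] && [ -f "$d/DATA.30" ] && [ -f "$d/OVER.30" ]; then
        echo "$d: looks like an intact Type 30 hashed file"
    else
        echo "$d: missing DATA.30/OVER.30 - possibly reverted or corrupted"
    fi
}

# Demo against a scratch directory that mimics a healthy layout.
demo=$(mktemp -d)
touch "$demo/DATA.30" "$demo/OVER.30" "$demo/.Type30"
check_hashed_file "$demo"
rm -rf "$demo"
```

Run it against the suspect hashed file's directory; a "one file per record" layout, as described later in this thread, fails the check immediately.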
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
First, thanks for your reply.
Unfortunately it is not rare. It happened this week in two different processes with two different hash files. I suspected the hash file was responsible, as it looks wrong in the file system and disabling memory pre-load eliminates the problem.
But I would like pre-load to be turned on in production.
Could you please explain how I can check the structural integrity of the hash file?
Three jobs are failing on the same hash file. When the hash file is created, it is supposed to be created as Type 30 (dynamic), with 'Allow stage write cache' turned on.
The funny thing is that reading from the hash file as the driver works fine.
I mean the following job:
```
(corrupted?) hashed file -> transformer -> seq file
                                 ^
                                 |
                         another hashed file
```
But this one fails:
```
oracle -> transformer -> seq file
              ^
              |
   (corrupted?) hashed file
```
Unfortunately the affected hash file in the Unix directory does not look like a normal one: there are separate files for each record, where the file name is the hash file key.
We are not performing any mathematical operations, just looking up data from the hash file.
TBartfai wrote: Unfortunately the affected hash file in the unix directory does not look like a normal one. There are separate files per record, where the name is the key of the hash file.

I forget what silly type this is, but it is one of the ways that a dynamic hashed file can 'corrupt' itself, especially if you are deleting and recreating it each time. The loss of the .Type30 file there for any reason can revert it to this type, where every record is a separate file. Best to remove the contents yourself so it can properly rebuild itself.
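The "remove the contents yourself" advice can be scripted with a guard so a healthy file is never touched by accident. This is a hedged sketch, not a supported repair procedure: the `.Type30`/`DATA.30` checks are assumptions about the usual Type 30 on-disk layout, and the path you pass in is yours to verify.

```shell
#!/bin/sh
# Hedged sketch: when a dynamic hashed file has reverted to
# one-file-per-record, remove it so the job can recreate it cleanly on
# the next run. Refuses to act on anything that still looks intact.
reset_hashed_file() {
    d="$1"
    if [ -f "$d/.Type30" ] && [ -f "$d/DATA.30" ]; then
        echo "$d: still looks intact; leaving it alone"
        return 1
    fi
    rm -rf "$d"
    echo "$d: removed; let the job recreate it"
}
```

Usage: `reset_hashed_file /path/to/hashed/file`, then let the job's "create file" option rebuild it on the next run.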
-craig
"You can never have too many knives" -- Logan Nine Fingers
Deleting the hash file solves the problem; I forgot to mention that I had already tried that.
The reason I am so interested in this case is that our customer does not accept this workaround and keeps insisting that we open a defect and prevent this behaviour.
Do you happen to know how to prevent this corruption?
BTW, you hit the nail on the head: we are recreating almost every hash file again and again, every day.
Or would moving to a static hash file type eliminate this behaviour?
That is what I would rather avoid, as it would affect all our processes and jobs.
Thanks for your reply; I would much appreciate any further help.
TBartfai wrote: BTW, you hit the nail on the head: we are recreating almost every hash file again and again, every day.

Stop. While the norm is to rebuild the contents of hashed files run to run, there usually isn't an overwhelming need to delete and recreate them each run as well. Why not switch to simply clearing them each time?
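Clearing rather than recreating can be sketched as below. This is hedged: `MyHashedFile` and the engine paths are assumptions, and `CLEAR.FILE` is the engine's TCL verb for emptying a file, normally issued from the project directory (so the VOC can resolve the name). Inside a job design you would more typically tick the Hashed File stage's clear-before-writing option instead of shelling out.

```shell
#!/bin/sh
# Hedged sketch: clear a hashed file between runs instead of deleting and
# recreating it. DSHOME, the uv binary path, and MyHashedFile are
# assumptions about a typical server install.
clear_hashed_file() {
    name="$1"
    if [ -n "$DSHOME" ] && [ -x "$DSHOME/bin/uv" ]; then
        # Must be run from the project directory so VOC resolves the name.
        echo "CLEAR.FILE $name" | "$DSHOME/bin/uv"
    else
        echo "would run: CLEAR.FILE $name"   # dry run without an engine
    fi
}

clear_hashed_file MyHashedFile
```

Clearing keeps the file's structure (and its tuning) in place, which is exactly what avoids the delete-and-recreate window where the reversion described above can happen.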
-craig
"You can never have too many knives" -- Logan Nine Fingers
The OP may be running into resource issues. I'm curious whether there has been any attempt to 'tweak' the settings in the uvconfig file for DataStage to help with this. For example, flirting with the edge of this parameter could cause the issue you are seeing, I do believe:
```
# T30FILE - specifies the number of
# dynamic files that may be opened.
# Used to allocate shared memory
# concurrency control headers.
T30FILE 200
```
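A rough way to see whether a project is anywhere near that limit is to count dynamic hashed files on disk against the configured value. This is a hedged sketch: each Type 30 directory is assumed to carry a hidden `.Type30` marker, the `DSHOME` fallback path is a guess, and T30FILE actually bounds *concurrently open* dynamic files server-wide, so a count of files on disk is only an upper-bound sanity check.

```shell
#!/bin/sh
# Hedged sketch: count Type 30 hashed files under a directory and read
# the T30FILE limit from uvconfig. Paths are assumptions.
count_type30() {
    # Each dynamic hashed file directory carries a hidden .Type30 marker.
    find "${1:-.}" -name .Type30 2>/dev/null | wc -l | tr -d ' '
}

t30file_limit() {
    # Pull the value from a line like "T30FILE 200" in uvconfig.
    awk '$1 == "T30FILE" { print $2 }' "${DSHOME:-/opt/dsengine}/uvconfig" 2>/dev/null
}

lim=$(t30file_limit)
echo "dynamic hashed files here: $(count_type30 .) (T30FILE limit: ${lim:-unknown})"
```

Remember that changes to uvconfig only take effect after regenerating and restarting the engine, so treat any tweak as a planned change, not a quick fix.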
-craig
"You can never have too many knives" -- Logan Nine Fingers
Again, thanks for all your answers.
We are not doing any manual or home-grown hash file manipulation; we use only the built-in stage properties.
But I have checked the uvconfig files on our test, production-like and production servers, and the value is either 2000 or 2050.
I do not know exactly why it was set so high. Maybe because we use many hash files for lookups instead of DB lookups. There are also hash files around 3 GB. I know that is not recommended; we are currently working on getting rid of them.