Hello folks:
Is there any chance that a hash file gets corrupted (hopefully the OS isn't the source of the corruption)?
What are the best practices for handling a hash file? What is the size limit of a hash file in DataStage 6.0?
Smile forever,
T.Vijay
Hash file corruption
Hi,
Hash files are basically used as lookup references to make fetches fast. To get the maximum performance, select only the columns that are required when creating the file. Hash files can get corrupted, but always make sure that you can re-create the hash file by re-running your job.
Also, a hash file has a maximum size of 2 GB by default. To grow a hash file beyond that, you need to create it with the 64-bit option.
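For example (a sketch from memory, so please verify the exact syntax against your release's documentation), an existing hashed file can be converted to 64-bit addressing from the Administrator client's Command window or the TCL prompt with something like

    RESIZE MyHashedFile * * * 64BIT

where MyHashedFile is just a placeholder for your own file name, and the three asterisks keep the current type, modulus and separation unchanged.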
Thanks
Rasi
Since hashed files are implemented as operating system files, it is possible for operating system events to trash them. I once saw this occur when a disk "repair" tool decided to truncate the OVER.30 file (for some unreported reason that to this day remains a mystery).
It is also possible, as rasi noted, for a default hashed file to become corrupted by trying to extend it beyond 2GB.
There are a couple of extremely low probability events in the file manager for DataStage that can leave hashed files corrupted if power is lost during a write but, by and large, they are fairly robust. Well-tuned hashed files (with few or zero overflowed groups and, ideally, few or zero oversized records) are the least vulnerable.
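If you want to check how well tuned a hashed file is (again a rough sketch; the exact report keywords vary between releases, so check the UniVerse help), you can run something like

    ANALYZE.FILE MyHashedFile

from the Administrator Command window, where MyHashedFile is a placeholder for a VOC pointer to your file; the report shows the file type and modulus, and with the statistics options it will also tell you how many groups have overflowed.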
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
War does not decide who is right, it only decides who is left.
(Bertrand Russell)
I agree, having been in one.
(Ray Wurlod)
Hello Folks:
If the hash file gets corrupted (partially or fully), will there be an error whenever we use that particular hash file in an ETL job, or do we only find out after the junk data has been loaded into the target stage?
BTW, thanks Rasi and Ray for your scintillating (as usual) replies.
Smile forever,
T.Vijay
The usual error message is "unable to open", which will abort your job fairly quickly - certainly before any rows are processed.
Do you have any evidence for believing that a hashed file has become corrupted, or are you just gathering knowledge?
Sometimes, if an error occurs and a job aborts, there is information in the &PH& directory. This is usually loaded into the job log when the job is reset.
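If you want to look at those &PH& entries directly (assuming a Unix installation; /path/to/project below is just a placeholder for your actual project directory), you can list the most recent ones from the shell:

    cd /path/to/project
    ls -lrt '&PH&' | tail

The quotes are needed because the & characters are special to the shell; the newest entries, listed last, belong to the most recent job runs.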