Hi All,
How does a hash file handle duplicate records? My understanding is that if a hash file has a key, a record coming in with the same key overwrites the existing record.
If it doesn't have a key, it keeps the incoming record even though it is a duplicate.
But does it make sense to store information in a hash file whose records have no key (or keys)?
Thanks,
Naveen.
Duplicates in Hash File
Re: Duplicates in Hash File
Your first assumption is correct. If a record arrives and its key is already present in the hash file, the old record is overwritten.
But your second assumption is wrong.
A few details about hash files, related to your assumptions:
1. A hash file is nothing but index-based storage of data.
2. To create an index you need key columns. So to create a hash file through DS, you have to define at least one key column; otherwise your job will give a compilation error.
3. There are different hashing algorithms for creating the indexes.
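The overwrite behaviour described above can be sketched in a few lines of Python. This is only an illustration of "last write wins" on a key, not DataStage code; the function name load_hashed_file, the column names and the sample rows are all made up for the example.

```python
def load_hashed_file(rows, key_columns):
    """Simulate writing rows to a keyed store where a later row
    with the same key replaces the earlier one (hypothetical helper)."""
    if not key_columns:
        # Mirrors the point that at least one key column must be defined.
        raise ValueError("at least one key column is required")
    stored = {}
    for row in rows:
        # Build the key from the key column values of this row.
        key = tuple(row[col] for col in key_columns)
        # Assigning to an existing key overwrites the old record.
        stored[key] = row
    return stored


# Made-up sample data: cust_id 1 arrives twice.
rows = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 1, "name": "Anne"},
]
result = load_hashed_file(rows, ["cust_id"])
```

Three rows go in, but only two records remain, and the record for cust_id 1 is the one that arrived last.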
Shantanu Choudhary
You have missed the vital point. There is NO index on the key. That's where the speed comes from; the key value is processed by a function (the "hashing algorithm") that returns the exact address of the page on which the record resides. Exactly one logical I/O is required to retrieve the record (unless the record is oversized or its group is overflowed).
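To make the idea concrete, here is a rough Python sketch of key-to-address hashing. It is not the actual algorithm used by hashed files; the class name HashedFile, the modulus of 7 and the use of CRC32 as the hash function are assumptions chosen only to show that the key itself determines which group to read, with no index lookup in between.

```python
import zlib


class HashedFile:
    """Toy model of a hashed file: a fixed number of groups (pages),
    with the group chosen by hashing the key. Illustrative only."""

    def __init__(self, modulus=7):
        self.modulus = modulus                    # number of groups (assumed value)
        self.groups = [[] for _ in range(modulus)]
        self.group_reads = 0                      # counts simulated logical reads

    def group_for(self, key):
        # The hash function maps the key straight to a group number.
        # CRC32 is a stand-in here, not the real hashing algorithm.
        return zlib.crc32(key.encode("utf-8")) % self.modulus

    def write(self, key, record):
        group = self.groups[self.group_for(key)]
        for i, (existing_key, _) in enumerate(group):
            if existing_key == key:
                group[i] = (key, record)          # same key: overwrite in place
                return
        group.append((key, record))               # new key: add to the group

    def read(self, key):
        self.group_reads += 1                     # one group fetched per lookup
        for existing_key, record in self.groups[self.group_for(key)]:
            if existing_key == key:
                return record
        return None


hf = HashedFile()
hf.write("1001", "Ann")
hf.write("1002", "Bob")
hf.write("1001", "Anne")      # duplicate key replaces the earlier record
found = hf.read("1001")
```

The lookup touches exactly one group, whichever one the key hashes to. The sketch ignores oversized records and overflowed groups, which are the cases where more than one read would be needed.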
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.