Different types of Hash files

Posted: Fri Jul 15, 2005 6:16 am
by Gokul
Hi,

I am in search of answers for the following questions:

1. What are the different types of hash files?
2. Under what conditions should we use each specific type?
3. What are the performance differences between the types of hash file?

Thanks,
Gokul

Posted: Fri Jul 15, 2005 6:25 am
by Viswanath
Hi Gokul,

I haven't worked on the Unix version of DS, but I bet you would find loads of material on hash files if you search the forum. I am not sure whether there is any difference between Unix and Windows hash files.

Cheers,
Vishy

Posted: Fri Jul 15, 2005 6:33 am
by ArndW
Gokul,

A quick search for answers to this question returned more responses than I can page through in my browser; I suggest you look there for some information.

The hashed file types are described in the UniVerse documentation, specifically in the UniVerse System Description 9.6.pdf, downloadable from IBM (UniVerse Documentation).

Posted: Fri Jul 15, 2005 7:09 pm
by ray.wurlod
Except for the pathnames there are no differences between hashed files on UNIX and on Windows.

The internal byte order may differ but that's governed by the type of CPU chip rather than by the operating system. There are, for example, some UNIX variants that run on Intel chips. In any case, the internal byte order is invisible to users.

Posted: Fri Jul 15, 2005 7:19 pm
by ray.wurlod
1. What are the different types of Hashed files?
Static (the number of groups is pre-set and does not change except through intervention) and dynamic (the number of groups can change dependent on the volume of data stored). For more information, search the forum. There are seventeen "types" of static hashed file, but these simply represent static hashed files with different hashing algorithms.

2. Under what conditions should we use the specific type?
To get started use the default type (dynamic) because it's easier. A perfectly tuned static hashed file will populate faster than a default dynamic hashed file because extra work is needed to "grow" the latter. In use for lookups they should work identically (see below), but we don't live in a perfect world where data are distributed ideally.

3. Performance difference in the types of hashed file.
Define "performance" here. Hashed files work by using the primary key value to calculate the address of the group (page) containing the record. Therefore a perfectly tuned hashed file, irrespective of type, requires exactly one logical I/O operation to access a record. The different types and their configuration settings allow you to get as close as possible to "perfectly tuned" (essentially no overflowed groups).
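
To make the key-to-group calculation concrete, here is a minimal Python sketch of a static hashed file. It is an illustration only: the CRC32 hash, the class name StaticHashedFile, and the group_capacity parameter are assumptions made for this example, not the hashing algorithms or structures that UniVerse actually uses.

```python
import zlib


def group_for_key(key: str, modulus: int) -> int:
    """Map a primary key to a group number in the range [0, modulus)."""
    # CRC32 is a stand-in hash chosen because it is deterministic;
    # it is NOT one of the product's real hashing algorithms.
    return zlib.crc32(key.encode("utf-8")) % modulus


class StaticHashedFile:
    """Toy static hashed file: the number of groups never changes."""

    def __init__(self, modulus: int, group_capacity: int):
        self.modulus = modulus                # fixed number of groups
        self.group_capacity = group_capacity  # records that fit before overflow (hypothetical unit)
        self.groups = [dict() for _ in range(modulus)]

    def write(self, key: str, record: str) -> None:
        self.groups[group_for_key(key, self.modulus)][key] = record

    def read(self, key: str):
        # One address calculation, then one group to inspect.
        return self.groups[group_for_key(key, self.modulus)].get(key)

    def overflowed_groups(self) -> int:
        """Count the groups holding more records than their capacity."""
        return sum(1 for g in self.groups if len(g) > self.group_capacity)


if __name__ == "__main__":
    too_small = StaticHashedFile(modulus=7, group_capacity=4)
    for i in range(100):
        too_small.write(f"CUST{i}", f"record {i}")
    print("overflowed groups:", too_small.overflowed_groups())
```

In this sketch a read always computes one group number and looks in one group. The cost of poor tuning shows up as overflowed groups, which in a real file would mean extra I/O to follow the overflow chain.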

Static hashed files need more regular maintenance (to resize them correctly) than dynamic hashed files (which resize themselves). Time must be allocated for analysis and implementation.
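
As a rough picture of what "resize themselves" can mean, the sketch below grows a file one group at a time using linear hashing, a common technique for this kind of structure. Whether the product works exactly this way is not stated in this thread, and the class name, the CRC32 hash, the group_capacity unit, and the 0.8 split_load threshold are all assumptions made for illustration.

```python
import zlib


class DynamicHashedFile:
    """Toy dynamic hashed file that adds one group at a time (linear hashing)."""

    def __init__(self, initial_groups=1, group_capacity=4, split_load=0.8):
        self.base = initial_groups    # number of groups at the start of the current round
        self.split_ptr = 0            # next group due to be split
        self.group_capacity = group_capacity
        self.split_load = split_load  # assumed load threshold that triggers a split
        self.groups = [dict() for _ in range(initial_groups)]
        self.count = 0                # total records stored

    @staticmethod
    def _hash(key):
        # Deterministic stand-in hash, not the product's real algorithm.
        return zlib.crc32(key.encode("utf-8"))

    def _group_for(self, key):
        h = self._hash(key)
        g = h % self.base
        if g < self.split_ptr:
            # This group was already split in the current round,
            # so use the doubled modulus to choose between its two halves.
            g = h % (self.base * 2)
        return g

    def write(self, key, record):
        group = self.groups[self._group_for(key)]
        if key not in group:
            self.count += 1
        group[key] = record
        # Grow while the overall load is above the threshold.
        while self.count > self.split_load * self.group_capacity * len(self.groups):
            self._split()

    def read(self, key):
        return self.groups[self._group_for(key)].get(key)

    def _split(self):
        old = self.groups[self.split_ptr]
        self.groups[self.split_ptr] = {}
        self.groups.append({})        # new group at index base + split_ptr
        self.split_ptr += 1
        if self.split_ptr == self.base:
            # Every group of this round has been split: start a new round.
            self.base *= 2
            self.split_ptr = 0
        for k, v in old.items():      # redistribute only the split group's records
            self.groups[self._group_for(k)][k] = v


if __name__ == "__main__":
    f = DynamicHashedFile()
    for i in range(1000):
        f.write(f"CUST{i}", f"record {i}")
    print("groups after load:", len(f.groups))
```

The redistribution inside _split is the extra work during population that a correctly pre-sized static file avoids. A lookup still needs only one address calculation.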

Note that the name is hashed file, not hash file. This refers to the fact that they use a hashing algorithm to determine the key's location among a finite number of groups.

There is a large body of work published in the past forty years about tuning hashed files; some of it is good, some of it isn't.

Posted: Sun Jul 17, 2005 3:39 am
by Gokul
Thanks Ray.

Posted: Sun Jul 17, 2005 11:59 pm
by ray.wurlod
Just out of curiosity, were these interview questions?

Posted: Mon Jul 18, 2005 1:10 am
by Gokul
Ray,
These were not interview questions.
Just trying to get more out of you :wink:


Gokul