Dynamic Hashed File Overflow

Post questions here relating to DataStage Server Edition in areas such as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

shaimil
Charter Member
Posts: 37
Joined: Fri Feb 28, 2003 5:37 am
Location: UK

Dynamic Hashed File Overflow

Post by shaimil »

Can someone please clear up the following queries on overflow in a dynamic hashed file for me.

1. When does a record get written to OVER.30 as opposed to the hashed file adding another group and resizing?

2. How are records stored in OVER.30? Are they kept in groups, or in a single group that holds all overflow records?

3. How are records larger than the group size or separation stored in both STATIC and DYNAMIC files?

4. How are large records (as specified by the LARGE RECORD SIZE parameter) stored in DYNAMIC files? Are they kept in overflow?

Thanks in advance
Shay
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Check this out:
viewtopic.php?t=85364
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
shaimil
Charter Member
Posts: 37
Joined: Fri Feb 28, 2003 5:37 am
Location: UK

Post by shaimil »

Thanks for that. It answers my first 2 queries but not the last 2. Any ideas?
3. How are records larger than the group size or separation stored in both STATIC and DYNAMIC files?

4. How are large records (as specified by the LARGE RECORD SIZE parameter) stored in DYNAMIC files? Are they kept in overflow?
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

If a record is larger than the group size of the hash file, then it has to go into an overflow group. I am not sure, but I think there is something at the end of each group that tells it the record continues. The size of a record is stored along with the @ID in the hash file. It functions more like a linked list.

If records are huge, then use a type 19 file. A hash file gets extremely slow when overflow records are continually accessed. The LARGE RECORD size can help performance a lot. If it were me, I would go to a static hash file at this point.
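
For reference, that tuning can be applied by EXECUTEing the TCL verbs from BASIC. This is only a minimal sketch, assuming a dynamic hashed file named MYHASH (a hypothetical name); the exact CONFIGURE.FILE and RESIZE options vary by release, so check HELP CONFIGURE.FILE and HELP RESIZE first:

   * Hedged sketch: MYHASH is a hypothetical hashed file name.
   * Raise LARGE.RECORD on the dynamic (type 30) file so that fewer
   * records spill into long OVER.30 buffer chains (1600 is an
   * illustrative value; keep it below the group buffer size):
   EXECUTE 'CONFIGURE.FILE MYHASH LARGE.RECORD 1600'
   * ...or convert it to a static hashed file instead (type 18,
   * modulo 401, separation 4 are illustrative values only):
   EXECUTE 'RESIZE MYHASH 18 401 4'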

To see it for yourself, create a type 18 file with a modulo of 3 and run:

od -doc FILE

at the UNIX level. Keep adding records to it and you will very quickly see how overflow works (a BASIC sketch for loading the records follows below).

"od" is the UNIX octal dump command.
-d decimal
-o octal
-c characters are displayed.
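
To drive the experiment, here is a minimal DataStage BASIC loader. TESTOVF is a hypothetical name; create it first from TCL with something like CREATE.FILE TESTOVF 18 3 1 (type 18, modulo 3, separation 1 - syntax may vary by release):

   * Hedged sketch: write fixed-size dummy records into the small
   * static file, then re-run "od -doc TESTOVF" between batches to
   * watch the groups fill up and spill into overflow.
   OPEN 'TESTOVF' TO F.TEST ELSE STOP 'Cannot open TESTOVF'
   FOR I = 1 TO 50
      WRITE STR('X', 200) TO F.TEST, 'KEY':I
   NEXT I
   CLOSE F.TEST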
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

When hashed files are empty each group is in one "buffer", the size of which is decreed by the separation parameter (static hashed files) or the GROUP.SIZE parameter (dynamic hashed files).

When a group overflows, a secondary group buffer (the same size) is used. In static hashed files this is appended to the file structure unless there is a free secondary buffer already within the file structure. In dynamic hashed files it is appended to the OVER.30 file (because the number of primary group buffers in DATA.30 may have to change) unless there is a free secondary buffer already within OVER.30.

Large records (larger than the buffer size in static hashed files, larger than the LARGE.RECORD parameter in dynamic hashed files) have an extended header (two extra 32-bit or 64-bit pointers) to a daisy chain of secondary buffers in which the data from the record are stored. The key to the record is always stored in the regular group buffer.

Only that part of a large record actually stored in the regular group buffer contributes to the "actual load" figure used in determining whether the dynamic hashed file needs to split (add a group buffer to DATA.30).

There is no real difference in storage between static and dynamic hashed files. In dynamic hashed files the secondary buffers, whether they are used for overflowed groups, oversized records, SICA block or partfile block, are all kept in OVER.30 because DATA.30 (which only contains primary group buffers) needs to grow and shrink.
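
If you want to watch these numbers, FILEINFO() in BASIC will report a dynamic hashed file's tuning values at run time. A minimal sketch, assuming the symbolic keys supplied by the standard UNIVERSE.INCLUDE FILEINFO.H include (names can vary by release) and a hypothetical file named MYHASH:

   $INCLUDE UNIVERSE.INCLUDE FILEINFO.H
   OPEN 'MYHASH' TO F.HASH ELSE STOP 'Cannot open MYHASH'
   * Each FILEINFO key below reads one tuning value from the file header.
   PRINT 'Modulus (groups in DATA.30): ' : FILEINFO(F.HASH, FINFO$MODULUS)
   PRINT 'Group size                 : ' : FILEINFO(F.HASH, FINFO$GROUPSIZE)
   PRINT 'Large record threshold     : ' : FILEINFO(F.HASH, FINFO$LARGERECORDSIZE)
   PRINT 'Split load (percent)       : ' : FILEINFO(F.HASH, FINFO$SPLITLOAD)
   PRINT 'Current (actual) load      : ' : FILEINFO(F.HASH, FINFO$CURRENTLOAD)
   CLOSE F.HASH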

I anticipate publishing a paper on hashed files in the short to medium term, provided certain legal obstacles can be overcome.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.