Physical size of dynamic hash file

Post questions here related to DataStage Server Edition, covering such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Suttond
Premium Member
Posts: 10
Joined: Wed Apr 09, 2003 11:15 am

Physical size of dynamic hash file

Post by Suttond »

Having written 655,000 rows of a single varchar field of 10 bytes to a hashed file, I believed the maximum size would be (10 + 2) * 655,000 = 7.86 MB. The total size of the hashed file (data + over) is 21.6 MB. Can anyone explain this?
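
For reference, here is the question's arithmetic as a quick Python sketch; the 2 bytes of per-row overhead is the poster's own assumption:

    # The naive estimate from the question: 655,000 rows of a 10-byte
    # varchar, allowing 2 bytes of per-row overhead (poster's assumption).
    rows = 655_000
    naive_mb = rows * (10 + 2) / 1e6
    print(f"naive estimate: {naive_mb:.2f} MB")  # 7.86 MB
    print(f"observed size:  21.6 MB ({21.6 / naive_mb:.1f}x larger)")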
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What was your 'minimum modulus' setting in the stage?
-craig

"You can never have too many knives" -- Logan Nine Fingers
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

Hashed files are BINARY files that pre-allocate space and store data using a placement algorithm within groups. When groups get full, they spill into overflow space. Dynamic files grow and reshuffle their data when they hit predetermined limits.

Hence, as Craig mentioned, the answer depends on what you have set the minimum modulus to.
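
To make that concrete, here is a toy Python sketch of that growth behaviour. It is not UniVerse's actual linear-hashing algorithm, and the 28-byte record size and 80% split threshold are illustrative assumptions:

    # Toy sketch of dynamic-file growth: when average load exceeds a
    # split threshold, the modulus increases (another group is added).
    # Real UniVerse files also relocate records on each split; this
    # sketch only tracks the group count. The 28-byte record size and
    # the 0.8 split load are assumptions, not measured values.
    GROUP_BYTES = 2048
    SPLIT_LOAD = 0.8
    RECORD_BYTES = 28

    modulus = 1  # the default minimum modulus, as in this thread
    for records in range(1, 655_001):
        if records * RECORD_BYTES / (modulus * GROUP_BYTES) > SPLIT_LOAD:
            modulus += 1

    print(f"final modulus: {modulus}")
    print(f"data groups:   {modulus * GROUP_BYTES / 1e6:.1f} MB")

Even this crude model lands in the same ballpark as the 21.6 MB reported above.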
Suttond
Premium Member
Posts: 10
Joined: Wed Apr 09, 2003 11:15 am

Post by Suttond »

The minimum modulus was left at the default of 1.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Every record has a minimum storage overhead of 13 bytes, and it is usually larger.

There are forward and backward pointers (each 32 or 64 bits) and another "flag word" of the same size. There is a single byte between the key and the data (a "segment mark"), a single byte between each field (a "field mark"), and the whole record is padded to a multiple of 32 or 64 bits.
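
Turning that layout into arithmetic, a sketch assuming 32-bit words; the 6-byte key length is an illustrative assumption:

    # Per-record on-disk size per the layout above, assuming 4-byte
    # (32-bit) pointers and flag word. The key length is illustrative.
    def record_bytes(key_len, field_lens, word=4):
        header = 3 * word                    # forward ptr, backward ptr, flag word
        body = key_len + 1                   # key plus its segment-mark byte
        body += sum(field_lens)              # the data itself
        body += max(len(field_lens) - 1, 0)  # field marks between fields
        return -(-(header + body) // word) * word  # pad up to a word multiple

    # An assumed 6-byte key plus the single 10-byte varchar field:
    print(record_bytes(6, [10]))  # -> 32 bytes, not the naive 12

At roughly 32 bytes per row under these assumptions, 655,000 rows already need about 21 MB before any group padding or overflow is counted.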

Every group ("page") is padded out to a multiple of (GROUP.SIZE * 2KB). This probably accounts for most of the "discrepancy" that you reported.

If a record is oversized (larger than specified by LARGE.RECORD) then an additional two pointers are created, and extra pages are created in the OVER.30 file to store that record's data.

If there are overflowed groups, these will also generate extra pages in the OVER.30 file. If the hashed file was created as a UniVerse table, then there will be an extra page (the "SICA block") in OVER.30.

The DATA.30 file will always be (current_modulus + 1) * GROUP.SIZE * 2 KB in size.
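
Plugging in numbers as a sketch (GROUP.SIZE = 1 is the default; the modulus is an assumed figure for what the file could have grown to, e.g. as reported by ANALYZE.FILE):

    # DATA.30 size per the formula above: (modulus + 1) groups of
    # GROUP.SIZE * 2 KB each, where the extra group is the file header.
    def data30_bytes(current_modulus, group_size=1):
        return (current_modulus + 1) * group_size * 2048

    # Illustrative: an assumed modulus around 10,500 with the default
    # GROUP.SIZE of 1 lands near the 21.6 MB reported in the question.
    print(f"{data30_bytes(10_500) / 1e6:.1f} MB")  # ~21.5 MB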
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.