Physical size of dynamic hash file
Having written 655,000 rows of a single VarChar field of 10 bytes to a hashed file, I believed the size would be roughly (10+2) * 655,000 = 7.86 MB. The total size of the hashed file (data + overflow) is 21.6 MB. Can anyone explain this?
Hashed files are BINARY files that pre-allocate space and store data using a placement algorithm within groups. When groups fill up, they spill into overflow space. Dynamic files grow and shuffle the data when they hit predetermined limits.
And hence the answer depends on what you have set the minimum modulus to, as Craig has mentioned.
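To make the "grow and shuffle" behaviour concrete, here is a toy sketch of linear hashing, the general scheme that dynamic (type 30) files use to split groups as they fill. This is purely illustrative, not the actual engine code: the group capacity of 4 and the 80% threshold are assumptions standing in for the SPLIT.LOAD-style tuning parameter.

```python
class DynamicFile:
    """Toy linear-hashing file: groups split one at a time as load rises."""

    def __init__(self, min_modulus=1, group_capacity=4, split_load=0.8):
        self.base = min_modulus          # modulus at the start of this doubling round
        self.next_split = 0              # index of the next group due to split
        self.capacity = group_capacity   # records a group holds before overflowing
        self.split_load = split_load     # stand-in for SPLIT.LOAD (assumption)
        self.groups = [[] for _ in range(min_modulus)]

    @property
    def modulus(self):                   # current number of groups
        return len(self.groups)

    def _group_for(self, key):
        h = hash(key)
        g = h % self.base
        if g < self.next_split:          # this group was already split this round
            g = h % (2 * self.base)
        return g

    def insert(self, key, value):
        self.groups[self._group_for(key)].append((key, value))
        load = sum(len(g) for g in self.groups) / (self.modulus * self.capacity)
        if load > self.split_load:       # hit the predetermined limit: grow
            self._split()

    def _split(self):
        # Add one new group at the end and re-hash ("shuffle") the records of
        # the group at next_split between the old group and the new one.
        old = self.next_split
        moving, self.groups[old] = self.groups[old], []
        self.groups.append([])
        self.next_split += 1
        if self.next_split == self.base:  # finished doubling: start a new round
            self.base *= 2
            self.next_split = 0
        for k, v in moving:
            self.groups[self._group_for(k)].append((k, v))

df = DynamicFile(min_modulus=1)
for i in range(50):
    df.insert(f"ROW{i}", i)
print(df.modulus, [len(g) for g in df.groups])
```

The minimum modulus fixes the starting number of groups, which is why space is consumed even before the first row arrives.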
Every record has a minimum 13-byte storage overhead; it is usually larger.
There are forward and backward pointers, each 32 bits or 64 bits, and another "flag word" of the same size. There is a single byte between the key and data (a "segment mark"), a single byte between each field (a "field mark"), and the whole thing is padded to a multiple of 32 bits or 64 bits.
Every group ("page") is padded out to a multiple of (GROUP.SIZE * 2KB). This probably accounts for most of the "discrepancy" that you reported.
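Plugging the original poster's numbers into that description shows where the 21.6 MB goes. On a 32-bit build the minimum overhead is 4 + 4 + 4 bytes of pointers and flag word plus the 1-byte segment mark, i.e. the 13-byte minimum quoted above. The key length here is an assumption (6-character keys such as "655000"); a rough sketch:

```python
# Rough per-record size under the layout described above (32-bit build).
# ASSUMPTIONS: 4-byte pointers and flag word, ~6-byte keys, one 10-byte
# field (so no field marks needed), padding to a 4-byte boundary.

def record_bytes(key_len, data_len, word=4):
    header = 3 * word                   # forward ptr + backward ptr + flag word
    body = key_len + 1 + data_len       # key, segment mark, single field
    raw = header + body
    return -(-raw // word) * word       # round up to the next word boundary

per_record = record_bytes(key_len=6, data_len=10)    # -> 32 bytes
print(per_record, 655_000 * per_record / 1e6)        # ~32 bytes, ~21.0 MB
```

That lands within a megabyte of the observed 21.6 MB before group padding and overflow are even counted.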
If a record is oversized (larger than specified by LARGE.RECORD) then an additional two pointers are created, and extra pages are created in the OVER.30 file to store that record's data.
If there are overflowed groups, these will also generate extra pages in the OVER.30 file. If the hashed file was created as a UniVerse table, then there will be an extra page (the "SICA block") in OVER.30.
The DATA.30 file will always be (current modulus + 1) * GROUP.SIZE * 2 KB in size: one group per unit of modulus, plus a header group.
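For completeness, the DATA.30 sizing rule above in code. The modulus value in the example is hypothetical, chosen only to show the order of magnitude that matches the file in question:

```python
PAGE = 2048                              # GROUP.SIZE is counted in 2 KB units

def data30_bytes(current_modulus, group_size=1):
    # (current modulus + 1) groups: one header group plus one per modulus unit
    return (current_modulus + 1) * group_size * PAGE

print(data30_bytes(10_240) / 1e6)        # ~21.0 MB with GROUP.SIZE 1
```

So a file that has split its way up to a modulus in the ten-thousands accounts for DATA.30 alone being tens of megabytes, with any overflowed or oversized records adding further pages in OVER.30.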
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.