Hash File Build

Post questions here relating to DataStage Server Edition, in such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

marc_brown98
Premium Member
Posts: 67
Joined: Wed Apr 14, 2004 11:33 am

Hash File Build

Post by marc_brown98 »

I have a server job that builds a hashed file for lookup purposes, and it seems to slow down immensely as it progresses. The source file has approximately 1.2 million records. I am using a type 30 dynamic hashed file, and the records being written are 3 fields: 2 keys and the lookup value. The job starts out very fast, but after 400k records it begins to slow, progressively dropping to fewer than 700 rows per second. Any suggestions?

Thanks
Marc
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Up the minimum modulus from 1. What you're probably seeing is the resize-by-doubling effect degrading performance. Check out this post as well:
viewtopic.php?t=85364
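
For example, something like this at the TCL prompt (a sketch from memory; the file name and modulus figure are placeholders, not calculated for your data):

   CREATE.FILE LKP.MYLOOKUP DYNAMIC MINIMUM.MODULUS 40000

That allocates the groups up front instead of splitting over and over during the load. If I recall correctly, the same setting is also exposed in the Hashed File stage's create-file options.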
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
marc_brown98
Premium Member
Posts: 67
Joined: Wed Apr 14, 2004 11:33 am

Post by marc_brown98 »

Thanks Kenneth!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Dynamic hashed files don't resize by doubling; they resize by adding one group (logically - physically, probably about eight group buffers at a time). This actually gives you more pain, as you're taking the hit of restructuring every N records loaded, where N is the number of records per group.

As Ken says, if you create the hashed file with its minimum modulus set to approximately what you'll need at the end, you take the hit of allocating this disk space up front, so that the load proceeds more quickly and at a non-diminishing rate.
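
As a back-of-envelope illustration (the record size here is an assumption for the sake of the arithmetic, not measured from your data): 1.2 million records at roughly 50 bytes each is about 60 MB. With the default 2 KB group size and about 80% packing, that suggests

   60 MB / (2 KB x 0.8) = roughly 38,000 groups

as a starting minimum modulus. If memory serves, the Hashed File Calculator (HFC) utility in the unsupported utilities folder of the DataStage installation CD will do this arithmetic more rigorously.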

Do you use write caching? This, too, can help load performance.

Another possibility, if your hashed file is large, is to use a static hashed file. This is one where the disk space is necessarily pre-allocated, and you get more control over the size of groups and the hashing algorithm used. Empirical evidence suggests that these perform slightly better than the equivalent dynamic hashed file at larger sizes; the downside is that they require more calculation and more maintenance.
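
For reference, a static hashed file is created at TCL with an explicit type, modulus and separation, along these lines (illustrative figures only: type 18 is the general-purpose hashing algorithm, 38011 is a prime near the modulus estimated above, and separation 4 gives 2 KB groups):

   CREATE.FILE LKP.MYLOOKUP 18 38011 4

A prime modulus helps spread the keys evenly across groups; the file type should be chosen to match the shape of your keys.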
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
marc_brown98
Premium Member
Posts: 67
Joined: Wed Apr 14, 2004 11:33 am

Post by marc_brown98 »

Ray,
Thanks for your input. I do not consider this hashed file to be very large: around 65 MB, approximately 1.2 million records. I will try the write caching; right now it takes around 30 minutes of wall-clock time to build.
marc_brown98
Premium Member
Posts: 67
Joined: Wed Apr 14, 2004 11:33 am

Post by marc_brown98 »

Ray & Kenneth,
Thanks much for the help. The suggestions pointed me in the right direction; build time is now less than 2 minutes.

Cheers
8)