
Hashed files - delete recs vs reload

Posted: Thu Mar 04, 2010 9:21 am
by ASU_ETL_DEV
Hello,
If I delete records from a hashed file, does that significantly affect the file's performance? Or is it better to unload and reload the entire file, filtering out the records I do not need?
Thanks

Posted: Thu Mar 04, 2010 9:28 am
by ArndW
While pure inserts are more efficient than deletes/updates on an existing table, the performance difference on a well-hashed file (dynamic files are well-hashed) isn't great.

Posted: Thu Mar 04, 2010 9:36 am
by ASU_ETL_DEV
If I keep deleting records daily, should I expect performance to degrade over time to the point of eventually requiring a reload? I am trying to assess the maintenance impact of one method versus the other.
Thanks

Posted: Thu Mar 04, 2010 9:51 am
by ArndW
Hashed files don't suffer from fragmentation in that manner, but the default settings of 20% merge / 80% split can cause overhead if you go past those limits. A hashed file applies a hashing algorithm to the record's key to decide which group the record is placed in; the number of groups is also known as the MODulo of the file. Within each group the records form a linked list, so removing an element is not a big issue.
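
As a rough illustration, here is a toy Python sketch of that structure. It is purely illustrative: the group count, the use of Python's hash(), and the node layout are simplified assumptions, not the actual UniVerse internals.

    # Toy model of a hashed file: records hash into one of
    # `modulo` groups; each group is a singly linked list.
    class Node:
        def __init__(self, key, data):
            self.key, self.data, self.next = key, data, None

    class HashedFile:
        def __init__(self, modulo=7):
            self.modulo = modulo
            self.groups = [None] * modulo      # one list head per group

        def _group(self, key):
            # Stand-in for the real hashing algorithm on the key.
            return hash(key) % self.modulo

        def write(self, key, data):
            g = self._group(key)
            node = self.groups[g]
            while node:                        # update in place if key exists
                if node.key == key:
                    node.data = data
                    return
                node = node.next
            new = Node(key, data)              # otherwise prepend to the group
            new.next = self.groups[g]
            self.groups[g] = new

        def delete(self, key):
            g = self._group(key)
            prev, node = None, self.groups[g]
            while node:
                if node.key == key:            # unlink: repoint one link,
                    if prev:                   # no wholesale reorganisation
                        prev.next = node.next
                    else:
                        self.groups[g] = node.next
                    return True
                prev, node = node, node.next
            return False

Deleting just unlinks one node from its group's list, which is why it is cheap.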

Posted: Thu Mar 04, 2010 10:09 am
by ASU_ETL_DEV
All right, thank you.

Posted: Thu Mar 04, 2010 2:52 pm
by ray.wurlod
There's more. Hashed files automatically re-use space freed by deleting records.
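
To illustrate the idea (a simplified sketch building on the toy model above, not the real file format): delete() can park the freed slot on a free list that a later write() recycles, so the file need not keep growing across delete/insert cycles.

    # Toy extension: recycle freed slots instead of allocating new ones.
    class HashedFileWithReuse(HashedFile):
        def __init__(self, modulo=7):
            super().__init__(modulo)
            self.free = []                     # slots released by delete()

        def delete(self, key):
            g = self._group(key)
            prev, node = None, self.groups[g]
            while node:
                if node.key == key:
                    if prev:
                        prev.next = node.next
                    else:
                        self.groups[g] = node.next
                    self.free.append(node)     # keep the slot for re-use
                    return True
                prev, node = node, node.next
            return False

        def write(self, key, data):
            g = self._group(key)
            node = self.groups[g]
            while node:                        # update in place if key exists
                if node.key == key:
                    node.data = data
                    return
                node = node.next
            if self.free:                      # recycle a freed slot first
                new = self.free.pop()
                new.key, new.data = key, data
            else:
                new = Node(key, data)
            new.next = self.groups[g]
            self.groups[g] = new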

Posted: Thu Mar 04, 2010 3:01 pm
by chulett
So... deleting from hashed files... we're talking a UV stage issuing SQL deletes here, I assume?

Posted: Thu Mar 04, 2010 3:03 pm
by ASU_ETL_DEV
Yes, SQL delete in a UV stage versus a reload of the file.