I'd like to write to a single hash file from multiple concurrent jobs. Previously I've written to a hash file using multiple inputs within a single job; is writing to one hash file from many jobs supported?
Thanks,
Peter
Hash File Concurrency
Actually, as long as the jobs are not trying to clear the hash file while others are writing to it, you're fine to have multiple jobs streaming output to the same hash file.
You'll have to be aware that having the same row written to the file by different jobs is not a good design, because you will have difficulty ensuring the correct version of the row is the last one written. If you're trying to both read and write the hash file, expecting coordination between jobs, good luck. And if you're using read and write caching, you'll have an even harder time.
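To see why the "last one written" problem is hard to control, here's a minimal Python sketch using a shared dict as a stand-in for the hash file (the key name and job count are illustrative, not from DataStage): several jobs each write their own version of the same row, and whichever thread happens to run last wins.

```python
import threading

# Shared dict standing in for the hash file (hypothetical stand-in).
store = {}

def job(job_id):
    # Each "job" writes its own version of the same row key.
    store["CUST001"] = f"version-from-job-{job_id}"

threads = [threading.Thread(target=job, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Which job's version survives depends entirely on scheduling;
# there is no way to know in advance which of the four it will be.
print(store["CUST001"])
```

Run it a few times and the surviving version can differ from run to run; that nondeterminism is exactly the coordination problem described above.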
If all you're doing is having multiple jobs dump data into the hash file, and you don't care about the same row coming from different jobs, you're fine; this is a common practice. What you will find is diminishing returns: the more jobs write to the hash file, the faster the file grows, and the constant resizing degrades performance for all of the jobs. It's best to set the initial modulo high enough that the file doesn't have to resize often.
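As a rough way to pick that initial modulo, here is a Python sketch of the common rule of thumb for static hashed files, which assumes each group holds roughly separation × 512 bytes; the function name, parameters, and the choice of rounding up to an odd number are illustrative assumptions, not an official formula.

```python
import math

def suggest_modulo(expected_rows, avg_row_bytes, separation=4):
    """Rough initial-modulo estimate for a static hashed file.

    Assumes the rule of thumb that each group holds about
    separation * 512 bytes of row data.
    """
    total_bytes = expected_rows * avg_row_bytes
    group_bytes = separation * 512
    modulo = math.ceil(total_bytes / group_bytes)
    # Nudge to an odd number, a common suggestion for better key spread.
    return modulo if modulo % 2 == 1 else modulo + 1

print(suggest_modulo(1_000_000, 120))  # 58595
```

Sizing for the expected final volume up front, rather than letting the file split its way there, avoids the resize churn while all the jobs are writing.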
Also, write-delay caching will deceive you: the jobs scream along and then stall on the last row while the cache writes out. When a bunch of jobs finish at the same time, they all congest as they fight to purge their caches to the file.
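The flush congestion can be sketched in Python with threads buffering rows locally and then purging them to a lock-protected shared store at the end; the lock and row counts here are illustrative stand-ins for the hash file's write serialization, not DataStage internals.

```python
import threading

file_lock = threading.Lock()   # stands in for the hash file's write lock
shared_file = {}

def cached_job(job_id, rows=1000):
    cache = {}
    # Rows "scream along" into the job's local write-delay cache.
    for i in range(rows):
        cache[f"{job_id}-{i}"] = i
    # The stall: the whole cache is purged in one locked burst, so jobs
    # that finish together queue up behind each other on the same lock.
    with file_lock:
        shared_file.update(cache)

threads = [threading.Thread(target=cached_job, args=(j,)) for j in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared_file))  # 4000
```

Each job looks fast until its final purge, and the purges serialize; that is the end-of-run pile-up described above.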
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle