Hi,
If we are loading the hash file in the same job in which it is read (both loading and reading are done in one job), can we check the "Allow stage write cache" option? If not, why?
Thnx
Subramanya
Hash stage-Allow stage write cache
No, you can't... and it makes sense why if you stop and think about it for a minute. The hash file writes are cached in memory and (technically) the hash itself isn't really created and populated until the stage 'closes' and all of the information is flushed to disk.
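The behaviour described above can be sketched in plain Python (this is an illustrative model of a delayed-write cache, not DataStage itself): buffered writes live only in memory, and nothing reaches the persisted file until the stage closes and flushes.

```python
# Minimal sketch of why rows written through a "write cache" are
# invisible until the stage closes: write() only touches an in-memory
# buffer, and close() performs the single flush to the "file" (a dict
# standing in for the hashed file on disk).

class CachedHashWriter:
    def __init__(self, on_disk):
        self.on_disk = on_disk      # the persisted hashed-file contents
        self.cache = {}             # buffered writes, held in memory

    def write(self, key, row):
        self.cache[key] = row       # nothing reaches disk yet

    def close(self):
        self.on_disk.update(self.cache)  # one flush at stage close
        self.cache.clear()

disk = {}
w = CachedHashWriter(disk)
w.write("A", {"Sum": 2})
print("A" in disk)   # False: a reader sees nothing while the job runs
w.close()
print("A" in disk)   # True: rows appear only after the flush
```

A reader opening `disk` mid-run would find it empty, which is exactly why the write-cache option cannot be combined with reading the same file in the same job.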
-craig
"You can never have too many knives" -- Logan Nine Fingers
Hash Stage closing
I am loading and reading the hash file in the same job (with write cache enabled). While the job is running, rows come from the source only after the hash loading stage has completed. So when the hash stage finishes, won't the cache be flushed to disk?
Please give more information on how you are loading your file and how you are reading it. Are you updating/inserting into the file at the same time you are reading it, or is it a load-first-then-read case?
In the latter case, you can use the Allow stage write cache option.
If you are concurrently both reading and writing to the file, even then you could enable the write cache option but you should also select the preload to memory with lock for updates option in the hash file stage where you are reading the file.
IHTH
Dhiraj
P.S. Please search the forum before you make a post. This question has been addressed here several times.
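The "preload to memory with lock for updates" idea above can be sketched as follows (a hedged Python model, not DataStage): instead of the read stage working from a stale snapshot, reads and writes go through one shared in-memory image of the hashed file, so a row written during the run is immediately visible to the reader.

```python
# Sketch of the shared-cache idea behind reading and writing the same
# hashed file concurrently: both sides use one in-memory image, so the
# read path sees the latest write instead of a preloaded snapshot.

class SharedCacheHashFile:
    def __init__(self):
        self.cache = {}                 # one shared in-memory image

    def write(self, key, row):
        self.cache[key] = row           # write lands in the shared image

    def read(self, key):
        return self.cache.get(key)      # read sees the latest write

hf = SharedCacheHashFile()
hf.write("A", {"Sum": 2})
print(hf.read("A"))   # {'Sum': 2} -- visible mid-run, not only after close
```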
subramanya,
if you are reading & writing different records then you can activate caching (delayed writes); but if you are updating data then you have no way of ensuring that one change doesn't overwrite another. Suppose your read stage reads record A with column Sum having value 2, adds 1 to it, and subsequently writes it back. Then, several hundred rows later, you need to increment record A again: when you read it, you will get the original record because of the caching. When the job finishes, you will get either the 1st or the 2nd version of A in your hash file - and you have no control over which version gets written when (it is a function of the hash mechanism).
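The lost-update hazard described above can be demonstrated with a small sketch (plain Python, not DataStage): the reader works from a stale snapshot, so the second increment starts from the original value and one update is silently lost at flush time.

```python
# Lost update with a stale read cache and delayed writes: both
# increments read the original row, so the final flushed value
# reflects only one of the two updates.

read_cache = {"A": {"Sum": 2}}   # snapshot taken when the stage opened
write_cache = {}                 # delayed writes, flushed at job end

def increment(key):
    row = dict(read_cache[key])  # stale read: never sees write_cache
    row["Sum"] += 1
    write_cache[key] = row       # buffered, not yet visible to readers

increment("A")                   # writes Sum = 3 to the write cache
increment("A")                   # still reads Sum = 2, writes Sum = 3 again

on_disk = dict(write_cache)      # flush at job close
print(on_disk["A"]["Sum"])       # 3, not 4 -- one increment was lost
```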