
Hash stage - Allow stage write cache

Posted: Fri Apr 08, 2005 12:11 am
by subramanya
Hi,
If we are loading the hash file in the job itself (both loading and reading are done in the same job), can we check the "Allow stage write cache" option? If not, why not?

Thnx
Subramanya

Posted: Fri Apr 08, 2005 12:36 am
by chulett
No, you can't... and it makes sense why if you stop and think about it for a minute. The hash file writes are cached in memory and (technically) the hash itself isn't really created and populated until the stage 'closes' and all of the information is flushed to disk.
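
To picture why, here is a minimal Python sketch of write-back caching (the class and names are illustrative assumptions for the example, not DataStage internals): rows written through the stage sit in memory and only reach the file that other stages read when the stage closes and flushes.

    # Sketch of write-back caching: rows written to the stage accumulate
    # in memory and only reach the on-disk hash file when the stage closes.
    # Names (HashedFileStage, etc.) are illustrative only.

    class HashedFileStage:
        def __init__(self):
            self.write_cache = {}   # rows buffered in memory while the job runs
            self.on_disk = {}       # what a reader of the file actually sees

        def write(self, key, row):
            # With "Allow stage write cache" on, the row stays in memory...
            self.write_cache[key] = row

        def read(self, key):
            # ...so a lookup against the file sees only what is already on disk.
            return self.on_disk.get(key)

        def close(self):
            # The flush to disk happens only when the stage closes.
            self.on_disk.update(self.write_cache)
            self.write_cache.clear()


    stage = HashedFileStage()
    stage.write("A", {"Sum": 2})
    print(stage.read("A"))   # None - nothing has been flushed yet
    stage.close()
    print(stage.read("A"))   # {'Sum': 2} - visible only after the flush
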

Hash Stage closing

Posted: Fri Apr 08, 2005 12:57 am
by subramanya
I am loading and reading the hash file in the same job (with write cache enabled). While running the job, rows start coming from the source only after the hash loading stage has completed. So when the hash stage finishes, won't it be flushed to disk?

Posted: Fri Apr 08, 2005 1:01 am
by chulett
In that case, yes - you'll be fine. For whatever reason, you gave me the impression the writing and reading were happening simultaneously in the job; that is the situation in which you can't enable write caching.

Posted: Fri Apr 08, 2005 1:05 am
by dhiraj
Please give more information as to how you are loading your file and how you are reading it. Are you updating/inserting into the file at the same time you are reading it, or is it a load-first, then-read case?

In the latter case, you can use the Allow stage write cache option.

If you are reading and writing to the file concurrently, you could still enable the write cache option, but you should also select the preload to memory with lock for updates option in the hash file stage where you are reading the file.

IHTH

Dhiraj

P.S. Please search the forum before you make a post. This question has been addressed here several times.

Posted: Fri Apr 08, 2005 1:05 am
by ArndW
subramanya,

if you are reading & writing different records then you can activate caching or delayed writes; but if you are updating data then you have no way of ensuring that one change doesn't overwrite the other. Say your read stage reads record A with column Sum having value 2, adds 1 to it and subsequently writes it back. Then, several hundred rows later, you need to increment record A again; when you read it, you will get the original record because of the caching. When the job finishes, you will get either the 1st or the 2nd version of A in your hash file - and you have no control over which version gets written when (it is a function of the hash mechanism).
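
For illustration, a minimal Python sketch of that lost update (the cache model and names are assumptions made for the example, not how the engine is actually implemented):

    # Sketch of the lost update described above: with write caching on,
    # reads come from disk while writes sit in the cache, so the second
    # increment never sees the first. Names and values are illustrative only.

    on_disk = {"A": {"Sum": 2}}   # record A starts with Sum = 2
    write_cache = {}

    def read(key):
        return dict(on_disk[key])      # reads come from disk, bypassing the cache

    def write(key, row):
        write_cache[key] = row         # writes only land in the cache

    # first increment of record A
    row = read("A")                    # Sum == 2
    row["Sum"] += 1
    write("A", row)                    # cache now holds Sum == 3

    # several hundred rows later, record A must be incremented again
    row = read("A")                    # still Sum == 2 - the cached 3 is invisible
    row["Sum"] += 1
    write("A", row)                    # cache again holds Sum == 3

    on_disk.update(write_cache)        # flush when the job finishes
    print(on_disk["A"]["Sum"])         # 3, not the expected 4 - one update is lost
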