
What is the purpose of the scratch disk?

Posted: Fri Dec 31, 2010 12:55 am
by sureshbabu
Hi,

Could you please clarify what the scratch disk is for? I know it is temporary storage and that it is used by stages such as Sort, Aggregator, and Lookup.
Please also explain which stages use this disk and for what purpose.


Thanks

Posted: Tue Jan 04, 2011 4:17 pm
by bcarlson
Well, I think you have kind of answered your own question. It is temporary storage space for stages like Sort, Aggregator, etc. The scratch space is partitioned just like your datasets, so each node has its own space for work. Databases do the same thing with temp space.

I think the engine uses this space to gather your data per node. Your input data is not sorted, so once it is partitioned, it is gathered here temporarily so that DataStage can sort it within each partition. I'm not sure about this, as I haven't really dug into the guts of how the engine works. I'll leave that to the gurus... Ray? Craig?
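
To make the idea concrete, here is a minimal Python sketch of a generic external sort that spills sorted runs to a scratch directory and then merges them. This is only an illustration of the general technique, not the DataStage engine's actual implementation; the function name `external_sort` and the `run_size` parameter are made up for the example.

```python
import heapq
import os
import tempfile


def external_sort(values, scratch_dir, run_size=3):
    """Sort integers with a bounded in-memory buffer, spilling runs to scratch_dir."""
    run_paths = []
    buffer = []

    def spill():
        # Sort what is in memory and write it to scratch as one sorted run.
        buffer.sort()
        fd, path = tempfile.mkstemp(dir=scratch_dir, suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in buffer)
        run_paths.append(path)
        buffer.clear()

    for v in values:
        buffer.append(v)
        if len(buffer) >= run_size:  # buffer full: spill to scratch
            spill()
    if buffer:
        spill()

    # Merge the sorted runs; heapq.merge reads each run file lazily.
    # The result is collected into a list here only to keep the example short.
    files = [open(p) for p in run_paths]
    try:
        merged = list(heapq.merge(*[(int(line) for line in f) for f in files]))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)  # scratch files are temporary
    return merged
```

With one scratch directory per node, each partition could run a sort like this independently, which matches the "each node has its own space" picture above.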

Brad.

Posted: Tue Jan 04, 2011 4:19 pm
by bcarlson
By the way, we have two sets of scratch disk defined on our system. The first is 100GB of RAM disk (i.e. memory mapped to a filesystem, so it is really fast). The second is a regular 500GB filesystem. If the first fills up, jobs automatically start using the second.

We saw upwards of a 50-60% efficiency increase when jobs started using the RAM disk for scratch space as opposed to regular disk.
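
The fall-over behaviour described above can be sketched roughly as follows. This is a hedged illustration of a "first location with enough room wins" policy, not the engine's actual logic; `pick_scratch_dir`, the `free_bytes` hook, and the example paths are all hypothetical.

```python
import shutil


def pick_scratch_dir(scratch_dirs, needed_bytes, free_bytes=None):
    """Return the first directory in scratch_dirs with at least needed_bytes free.

    free_bytes is a hypothetical hook (path -> int) so the policy can be tried
    without real filesystems; by default it asks the OS via shutil.disk_usage.
    """
    if free_bytes is None:
        free_bytes = lambda path: shutil.disk_usage(path).free
    for path in scratch_dirs:
        # Directories are tried in the order listed: fast tier first.
        if free_bytes(path) >= needed_bytes:
            return path
    raise OSError("no scratch directory has enough free space")
```

Listing the RAM disk first, e.g. `pick_scratch_dir(["/ramdisk/scratch", "/data/scratch"], needed_bytes)`, would prefer the fast tier and fall back to the regular filesystem only when the RAM disk is too full.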

Brad.