What is the purpose of the scratch disk?

sureshbabu
Participant
Posts: 11
Joined: Fri May 29, 2009 9:54 pm
Location: hyd

Post by sureshbabu »

Hi,

Could you please clarify what the scratch disk is for? I know it is temporary memory and that it is used by stages such as Sort, Aggregator, and Lookup. Please also explain which stages use this disk and for what purpose.

Thanks
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Well, I think you have kind of answered your own question. It is temporary storage space for stages like Sort, Aggregator, etc. The scratch space is partitioned just like your datasets, so each node has its own space to work in. Databases do the same thing with temp space.

I think the engine uses this space to gather your data on each node. Your input data is not sorted, so once it is partitioned it is gathered here temporarily so that DS can sort it within each partition. I am not sure about this, though; I haven't really dug into the guts of how the engine works. I'll leave that to the gurus.... Ray? Craig?
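To illustrate the general idea (this is not DataStage's actual code, just a rough Python sketch of a sort that spills to temporary space), the function below sorts small chunks in memory, writes each chunk as a run file into a scratch directory, and then merges the runs. The names `external_sort`, `scratch_dir`, and `max_in_memory` are made up for the example, and it assumes the records are strings without embedded newlines.

```python
import heapq
import os
import tempfile


def external_sort(records, scratch_dir, max_in_memory=3):
    """Sort strings by spilling sorted runs to scratch_dir, then merging them.

    Hypothetical illustration only; records must not contain newlines.
    """
    run_paths = []
    buffer = []

    def spill():
        # Sort the in-memory chunk and write it out as one run file.
        buffer.sort()
        fd, path = tempfile.mkstemp(dir=scratch_dir, suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(rec + "\n" for rec in buffer)
        run_paths.append(path)
        buffer.clear()

    for rec in records:
        buffer.append(rec)
        if len(buffer) >= max_in_memory:
            spill()
    if buffer:
        spill()

    # Merge the sorted runs; heapq.merge reads each file lazily.
    files = [open(p) for p in run_paths]
    try:
        merged = heapq.merge(*files, key=lambda line: line.rstrip("\n"))
        # The result is materialized as a list to keep the sketch short.
        return [line.rstrip("\n") for line in merged]
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
```

The scratch directory only holds the run files while the sort is in progress, which is why this kind of space can be cleaned out between jobs.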

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
Post by bcarlson »

By the way, we have two sets of scratch disk defined on our system. The first is 100 GB of RAM disk (i.e. memory mapped to a filesystem, so it is really fast). The second is a regular 500 GB filesystem. If the first fills up, jobs automatically start using the second.
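The fallback behaviour can be pictured with a small Python sketch. This is only an illustration of "use the first scratch area that still has room", not a description of how the engine decides internally; `pick_scratch_dir` and the `free_bytes` hook are hypothetical names, and the default free-space check simply uses `shutil.disk_usage`.

```python
import shutil


def pick_scratch_dir(scratch_dirs, bytes_needed, free_bytes=None):
    """Return the first directory in scratch_dirs with enough free space.

    free_bytes is an optional callable (path -> free bytes); by default
    the free space is read from the filesystem with shutil.disk_usage.
    """
    if free_bytes is None:
        free_bytes = lambda path: shutil.disk_usage(path).free
    for path in scratch_dirs:
        if free_bytes(path) >= bytes_needed:
            return path
    raise OSError("no scratch directory has enough free space")
```

Listing the RAM disk first and the regular filesystem second would give the behaviour described above: the fast area is used until it cannot hold the request, then the slower one takes over.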

We saw an efficiency increase of upwards of 50-60% when jobs started using the RAM disk for scratch space instead of regular disk.

Brad.