
Estimating memory usage requirement of a DS Job

Posted: Thu Jun 15, 2006 12:26 am
by newsaur
Is there any systematic way to estimate the memory usage requirement of a particular DS Job?

Thanks.

Posted: Thu Jun 15, 2006 12:42 am
by ArndW
There is no simple method to do this. If you start off with a score dump (the APT_DUMP_SCORE setting) you will see how the compiler has put the runtime objects together: how many actual processes are fired off and which stages have been "combined" out of the executable. Then you have to look at each stage and, depending upon the stage type, estimate its memory usage and how much of that might be shared with other processes. You also need to look at any implicit or explicit sorting that might be going on and add that to the equation. Add in the buffering settings for most stages and you get a rough idea. These numbers will change depending upon several factors such as data volumes, join types, sorts done, repartitioning required and even relative stage speeds.
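
As an illustration only, here is a minimal back-of-envelope sketch in Python that adds up per-process working memory and buffering the way the steps above describe. Every stage name and figure in it is a hypothetical placeholder; real values have to come from your own score dump and stage settings.

    # Back-of-envelope memory estimate following the steps above.
    # All stage names and numbers are hypothetical placeholders.
    MB = 1024 * 1024

    # One entry per process left in the score after stage combination:
    # (name, per-stage working memory, buffer to the downstream stage).
    processes = [
        ("import",  2 * MB, 3 * MB),  # source read plus default buffer
        ("sort",   20 * MB, 3 * MB),  # implicit/explicit sort memory
        ("join",   10 * MB, 3 * MB),  # join working set
        ("export",  2 * MB, 0),       # final write, no downstream buffer
    ]

    nodes = 4  # degree of parallelism (partitions per stage)

    total = sum(work + buf for _, work, buf in processes) * nodes
    print(f"Rough estimate: {total / MB:.0f} MB across {nodes} nodes")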

That is just for estimating memory usage; if you wish to estimate disk usage, the process needs to be repeated to get those values as well.

It is easier to monitor a job while it runs, or once it has finished, to get its actual usage.
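
For the monitoring side, a minimal sketch (assuming the third-party psutil package; the "osh" process-name pattern is an assumption about what the parallel engine's processes are called on your platform) that sums the resident memory of matching processes at one point in time:

    # Sum resident memory of all matching engine processes.
    # Assumes psutil; "osh" is a hypothetical process-name pattern.
    import psutil

    PATTERN = "osh"

    total_rss = 0
    for proc in psutil.process_iter(["name", "memory_info"]):
        name = proc.info["name"] or ""
        mem = proc.info["memory_info"]  # None if access was denied
        if PATTERN in name and mem is not None:
            total_rss += mem.rss

    print(f"Resident memory of '{PATTERN}' processes: "
          f"{total_rss / (1024 * 1024):.0f} MB")

Run it repeatedly while the job executes to catch the peak; a single sample only shows the usage at that moment.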

Posted: Thu Jun 15, 2006 2:08 am
by ray.wurlod
More facilities will be available in the next ("Hawk") release to allow for resource monitoring and estimation.