Reducing Swap Usage in Simple Jobs

dougcl
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

Hi folks, I have a bunch of very simple extract jobs. OracleConnector->Sort->DataSet.

With 16 partitions, I am finding that each one of these jobs requires about 3.2GB of swap space (based on using vmstat while the jobs are running).

At this rate I consume all 30GB of my "available" swap after about 7 jobs. Are there parameters that I can tune to reduce the footprint of each of these jobs? I found the memory restriction in the sort stage, and that helps a little.

Should I merely look at increasing the swap? 3.2GB per job seems huge.

Thanks,
Doug
Last edited by dougcl on Fri Nov 12, 2010 5:09 pm, edited 2 times in total.
dougcl
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

Hi folks, any idea why simple jobs of the form

Oracle Connector->Sort->Dataset

consume 3.2GB of swap space per instance, independent of the amount of data they are transferring?

Perhaps more simply, are there parameters I can tune (like limit MB in the sort stage) to reduce this footprint?

Thanks,
Doug
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You can certainly tune memory in the Sort stage, up or down, but if the system is already swapping heavily I'd be more inclined to figure out what's causing that. Is it genuine swap (to system swap space) or is it usage of the scratchdisk resources mentioned in your configuration file? If the latter, do the file names indicate the culprit(s)? Can your monitoring of the system identify the culprit processes?
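
One way to answer the "culprit processes" question is to total memory per command name from `ps` output. The following is only a rough sketch: it assumes a `ps` that accepts `-eo rss,vsz,comm` and reports sizes in KB, and the sample rows and process names are made up for illustration rather than taken from Doug's system.

```python
import subprocess
from collections import defaultdict


def summarize_ps(ps_output: str) -> dict:
    """Sum resident (rss) and virtual (vsz) size in KB per command name."""
    totals = defaultdict(lambda: {"count": 0, "rss_kb": 0, "vsz_kb": 0})
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        rss, vsz, comm = parts
        if not (rss.isdigit() and vsz.isdigit()):
            continue  # ignore rows that do not carry numeric sizes
        entry = totals[comm.strip()]
        entry["count"] += 1
        entry["rss_kb"] += int(rss)
        entry["vsz_kb"] += int(vsz)
    return dict(totals)


def read_ps() -> str:
    """Capture a live snapshot (assumes `ps -eo rss,vsz,comm` works on this host)."""
    result = subprocess.run(
        ["ps", "-eo", "rss,vsz,comm"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical sample output, used here instead of a live snapshot.
SAMPLE = """\
  RSS    VSZ COMMAND
30720 102400 osh
28672  98304 osh
 2048   8192 sshd
"""

summary = summarize_ps(SAMPLE)
ranked = sorted(summary.items(), key=lambda kv: kv[1]["vsz_kb"], reverse=True)
for comm, t in ranked:
    print(f"{comm:10s} n={t['count']} rss={t['rss_kb']} KB vsz={t['vsz_kb']} KB")
```

On a live system, `summarize_ps(read_ps())` gives the same breakdown. Watching the totals while a single job runs should show whether the memory is spread evenly across many processes or concentrated in a few.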
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dougcl
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

ray.wurlod wrote:You can certainly tune memory in the Sort stage, up or down, but if the system is already swapping heavily I'd be more inclined to figure out what's causing that. Is it genuine swap (to system swap space) or is it usage of the scratchdisk resources mentioned in your configuration file? If the latter, do the file names indicate the culprit(s)? Can your monitoring of the system identify the culprit processes?
Hi Ray, I wish it were the scratch space, because I've learned how to control that. This is system memory. We only have 16GB of RAM and 30GB of swap. I'm swapping because I am running ten jobs at once and each one (apparently) requires 3.2GB. This seems like a lot of memory for each job, given that they each contain only three stages. It almost acts like a fixed allocation.

I suppose I could start with a trivial job of some kind, and work my way up and see which part of the job is grabbing the memory.
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ

Post by mhester »

Doug,

Swap will be used when a page in memory needs to be freed. The usual IBM recommendation for swap is 8 GB * number of CPUs.

I am assuming your configuration has 16 logical nodes (based on your 16 partitions) and at least 3 players per node in each job (possibly more), plus one section leader for each node and one conductor (verify with your job score).

So -

16 nodes * 4 (let's say 4 players per node) + 16 section leaders + 1 conductor = 81 Unix processes per job. 81 * 10 jobs = 810 Unix processes.

Each of these processes wants a piece of the pie when it starts, and on Linux and most other flavors of Unix I have seen, most grab about 30 MB of memory.
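
The arithmetic above can be sketched in a few lines of Python. The figures are the ones from this post (16 nodes, an assumed 4 players per node, roughly 30 MB per process); the function names are only illustrative, and the real counts should come from the job score.

```python
def processes_per_job(nodes: int, players_per_node: int) -> int:
    """Players on every node, plus one section leader per node, plus one conductor."""
    return nodes * players_per_node + nodes + 1


def estimated_memory_mb(process_count: int, mb_per_process: float = 30.0) -> float:
    """Rough footprint, assuming every process takes about the same amount."""
    return process_count * mb_per_process


per_job = processes_per_job(nodes=16, players_per_node=4)
total = per_job * 10  # ten concurrent jobs

print(per_job, total)                     # 81 810
print(estimated_memory_mb(per_job))       # 2430.0 (MB per job)
print(estimated_memory_mb(total) / 1024)  # roughly 23.7 (GB for ten jobs)
```

At about 2.4 GB per job, this estimate is in the same ballpark as the 3.2 GB per job reported earlier in the thread, which suggests the footprint comes mostly from the process count rather than the data volume.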

Swap, as stated above, goes hand-in-hand with memory. If memory usage is high, then I believe swapping will begin, which prompts me to ask: how much memory do you have on this system?

Your seemingly "simple" jobs may not be so kind to the environment, depending on data volumes, type of sort, etc., and on how much swap and memory you currently have.

I would not recommend at all that you start changing parallel tunables, as changing one may impact or defeat another and cause other issues downstream that may be worse than the problem you first had.

Remember that your 800+ processes are not the only processes competing for system resources.
dougcl
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

mhester wrote:Doug,
GREAT POST. Thanks a lot.
dougcl
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

Hi folks, is there a way to reduce the 30MB used per process? This seems like a huge amount of memory. It would be nice to get this down to 10K or so.

Thanks,
Doug
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Was it not Bill Gates who asserted that you should never need more than 640KB of memory?
:roll:
dougcl
Premium Member
Posts: 137
Joined: Thu Jun 24, 2010 4:28 pm

Post by dougcl »

Hi folks, this appears to be a Solaris thing. IBM recommends doubling memory estimates for the Solaris platform.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Were it not that I know the support people to be honest about these things, I might cynically suggest that IBM's next recommendation will be to purchase pSeries machines from them!