
size of the logs affecting job performance

Posted: Fri Oct 20, 2006 10:44 am
by anton
Before I go into more detail, I just wanted to see if any of you know, off the top of your head, of any issues that stem from the size of the logs.

We keep a few days' worth of logs around, and we have noticed that as the log grows, jobs take longer and longer to complete (an extra 10-20 minutes on top of the usual runtimes of 30-40 minutes).

We are talking about several thousand entries (sometimes 10K+). We use a sequencer to kick off the jobs; it in turn calls a custom BASIC routine that handles the parameters, kicks off the job, and checks its return status.
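
For context, the routine itself is nothing fancy - roughly along these lines (just a sketch; the argument and parameter names are made up, but DSAttachJob/DSRunJob/DSWaitForJob/DSGetJobInfo are the standard job-control calls):

      * Sketch of the routine body (server routine taking arguments
      * JobName and LoadDate; names here are invented for illustration).
      Ans = 0

      hJob = DSAttachJob(JobName, DSJ.ERRFATAL)
      If NOT(hJob) Then
         Call DSLogWarn("Could not attach to " : JobName, "RunAndCheck")
         Ans = -1
      End Else
         * Pass the parameters through to the job, then run it and wait
         ErrCode = DSSetParam(hJob, "pLoadDate", LoadDate)
         ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
         ErrCode = DSWaitForJob(hJob)

         * Anything other than a clean or warning finish counts as a failure
         Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
         If Status <> DSJS.RUNOK And Status <> DSJS.RUNWARN Then Ans = Status

         ErrCode = DSDetachJob(hJob)
      End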

If the logs are reset or the job is recompiled right before the run, it often performs much faster. If left alone, it gets slower and slower with every run.

Nothing special, everything is pretty standard, and IBM support could not find any relevant cases. Just looking for some pointers.

Thank you.

P.S. A bit unrelated, but sometimes when checking logs with dsjob from the command line, or from Director, the client process hangs, while dsapi_slave processes hang around for days eating quite a lot of CPU until we kill them. We even put a script together to monitor them.

Posted: Fri Oct 20, 2006 10:54 am
by kris007
It's true that the size of the logs affects job performance. It's always good practice to clean up the log entries regularly. Here we keep logs for one week; anything older than that is auto-purged. There are quite a few posts here discussing this that you can find by searching.

Posted: Fri Oct 20, 2006 11:10 am
by chulett
As Kris notes - yes. This may sound obvious, but we've also found that the disk you are installed on can make an enormous difference as well. It turned out our initial setup put the DataStage directory on some old, slow SCSI drives that just couldn't keep up with all the requests they were getting. Migrating that to 'primary' enterprise-type storage solved that particular problem.

Posted: Fri Oct 20, 2006 11:44 am
by kcbland
Active stages insert lines into the log when they start and finish. If the log is extremely large or poorly sized (they are dynamic hashed files, after all), these messages can take a seriously long time to be added to the hashed file. The more active stages in the job, the longer it takes for the job to start and stop, not to mention the startup and wrap-up messages from the main job controller.

The point is: your project needs to be on an efficient file system, your logs have to be kept small, and you periodically MUST clear out the logs entirely.

Just purging is not sufficient. The files get upsized to a high-water mark; only a CLEAR.FILE actually releases the storage and sets the hashed file back to its initial size. Consider grabbing the utilities in my website's members area to clear the logs via a job. Use one utility to truncate the logs, then the other to reset the log purge settings on the jobs afterwards.

Posted: Fri Oct 20, 2006 1:41 pm
by anton
Kenneth, thank you. So you are saying that having a project-level purging setting, and even clearing the logs in Director, is not enough - I should take an extra step to clear them?

Others: we do have purging set up - we used to keep 3 days' worth, but that was slowing things down; now, if we purge the logs right before the run every night, things go smoothly. Your comments make sense, they are common sense, and we were already following all of these recommendations. What we are seeing, however, is that keeping several days of logs results in jobs running 10-20 minutes slower than usual - which goes beyond the common-sense "big logs cause some performance problems" issue.

Thank you for the input.

Posted: Fri Oct 20, 2006 2:26 pm
by kcbland
Dynamic hashed files have a modulus value that increases as the file grows. There are only two ways to change the modulus of a hashed file:

CLEAR.FILE: removes all data and resets the file to the minimum modulus

RESIZE.FILE: analyzes the current file contents and adjusts the modulus to an optimal setting

Purging deletes the contents, similar to a SQL DELETE statement. No space is reclaimed; the file stays the same size and the modulus doesn't adjust.

Adding rows can potentially grow the file even larger, even if it holds almost no "real" data, just empty space.

A manual purge that clears the file actually has to issue the CLEAR.FILE statement and then write the purge-settings row back into the log file. Because you would have to do this by hand for every log file, I wrote utility jobs to help facilitate it.
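
If you want to see the shape of it, here is a stripped-down sketch (not the actual utility). It assumes the usual server layout, where each job's log is a hashed file named RT_LOGnn and nn is the job number held in the DS_JOBS record - treat the field position of JOBNO as an assumption and verify it in your own project before relying on anything like this:

      * Sketch: clear one job's log hashed file via CLEAR.FILE.
      * Assumes logs live in RT_LOGnn, where nn is the job number taken
      * from the DS_JOBS record (field 5 assumed here - verify locally).
      * Remember that CLEAR.FILE also wipes the purge-settings row, so
      * that row has to be written back afterwards.
      Ans = 0

      Open "DS_JOBS" To F.JOBS Then
         Read JobRec From F.JOBS, JobName Then
            JobNo = JobRec<5>
            Call DSExecute("UV", "CLEAR.FILE RT_LOG" : JobNo, Output, SysRet)
            If SysRet <> 0 Then Ans = SysRet
         End Else
            Ans = -1   ;* job name not found in DS_JOBS
         End
      End Else
         Ans = -2      ;* could not open DS_JOBS
      End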

Posted: Fri Oct 20, 2006 2:35 pm
by ray.wurlod
... up to version 7.5.2 at least.

Re: size of the logs affecting job performance

Posted: Tue Oct 24, 2006 8:53 am
by anton
Thanks for your input, guys. IBM acknowledged the problem and provided a patch.

In some cases the log files get corrupted, and then performance drops through the floor. This seems to be inherent to deficiencies in the UniVerse database file hashing that stores the logs.

If this bites you, ask support for the patch.

Thank you.