
Performance Tuning Tips

Posted: Tue Jun 07, 2011 4:17 am
by eph
Hi all,

I was just wondering if you have some DS tuning tips for a "special case".

My requirement is to be able to launch a lot of jobs in parallel (namely between 100 and 400), with little input for each. Each job will process 100 to 1000 lines, from sequential files, XML or database tables.

The functional requirements are quite complex, which leads to complex jobs with a lot of aggregations, sorts, lookups...

I understand that DS deals efficiently with high volumes in a few jobs, but from what I've seen it has performance issues when dealing with many low-volume jobs in parallel. I'm already going to modify the uvconfig T30FILE and MFILES parameters, since I got an error on the number of open file descriptors when I run more than 50 jobs in parallel, even though the system seems able to handle at least 150.
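
For what it's worth, here is roughly how I check the OS side of the limits before touching uvconfig (standard Linux commands; dsadm is just the usual engine owner account on our install, adjust to yours):

ulimit -n                  # per-process open file limit for the current user/shell
cat /proc/sys/fs/file-max  # system-wide limit on open file handles
lsof -u dsadm | wc -l      # rough count of files currently open by the engine user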

Are there any OS/DataStage tuning or configuration tips that could help? Have you already faced this kind of requirement? Tell me if you want more input.

Thanks in advance,
Eric

Re: Performance Tuning Tips

Posted: Tue Jun 07, 2011 11:38 am
by jgreve
eph wrote: The functional requirements are quite complex, which leads to complex jobs with a lot of aggregations, sorts, lookups...
That is an interesting problem; please post what you end up doing.

Suggestion: job start-up times took a hit with version 8 when using dsjob or the dsapi to start them; each invocation has to fire up a JVM now, I suppose in order to send web-service requests to DataStage.
I suspect you could shave off a significant part of your overhead by launching your jobs with one initial sequencer instead of calling dsjob like 400 times. (I'm planning on benchmarking that, but my research time has been sparse. *sigh* Anyway, that is just something I've been thinking about and wanted to put it out for your consideration.)
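
If you want to put a number on that start-up overhead, something as rough as this would show it (project and job names are placeholders, and I'm going from memory on the dsjob options, so check them against the docs):

time dsjob -run -jobstatus YourProject OneSmallJob     # one call per job: JVM start + request + actual run
time dsjob -run -jobstatus YourProject MasterSequence  # one call for a sequence that drives all the small jobs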


Questions:
What kind of run time do you want to realize for all jobs: 24 hours? 2 hours?
What are you running on, a single SMP machine? A Grid?
If SMP, how many cores? And what kind of configuration file(s)? (I'd expect all of the small jobs to run on single-node configurations.)
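
For reference, by "configuration file(s)" I mean the APT_CONFIG_FILE the jobs run with; a minimal single-node file would look roughly like this (hostname and resource paths are placeholders):

{
    node "node1"
    {
        fastname "your_server_hostname"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
}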

Re: Performance Tuning Tips

Posted: Thu Jun 09, 2011 9:07 am
by eph
jgreve wrote:
eph wrote: The functional requirements are quite complex, which leads to complex jobs with a lot of aggregations, sorts, lookups...
That is an interesting problem; please post what you end up doing.
I ended up developing my jobs :lol:
The business requirements and the data integration platform's architecture make it difficult to reduce the overall job complexity. Right now the platform is quite stable in terms of business requirements, so I don't want to refactor the existing development (small improvements weighed against a huge rework and the risk to business requirement stability).
jgreve wrote: Suggestion: job start-up times took a hit with version 8 when using dsjob or the dsapi to start them; each invocation has to fire up a JVM now, I suppose in order to send web-service requests to DataStage.
I suspect you could shave off a significant part of your overhead by launching your jobs with one initial sequencer instead of calling dsjob like 400 times. (I'm planning on benchmarking that, but my research time has been sparse. *sigh* Anyway, that is just something I've been thinking about and wanted to put it out for your consideration.)
That's interesting. I may consider building such a sequencer, with dynamic starting of jobs against a pool of files to process, but I can already handle resource quotas using the Control-M scheduler (which is part of our architecture).
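
If I go that way, I picture something as simple as a shell wrapper feeding the pool of files to a multi-instance job, along these lines (a rough sketch only: the project, job and parameter names are made up, and I'm quoting the dsjob invocation-id syntax from memory):

POOL_DIR=/data/pool                          # directory holding the files waiting to be processed
for f in "$POOL_DIR"/*.xml; do
    id=$(basename "$f" .xml)                 # use the file name as the invocation id
    dsjob -run -param pInputFile="$f" MyProject ProcessFile."$id" &
done
wait                                         # wait for all instances to finish
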
jgreve wrote: Questions:
What kind of run time do you want to realize for all jobs: 24 hours? 2 hours?
As far as I know, we won't have more than 2 hours to process everything.
jgreve wrote: What are you running on, a single SMP machine? A Grid?
If SMP, how many cores? And what kind of configuration file(s)? (I'd expect all of the small jobs to run on single-node configurations.)
We are on an SMP machine: 16 cores, 32 GiB RAM. The default configuration file used on the project is a two-node one. We've already tested a one-node configuration without noticing a significant difference (I'd guess we gained 1 to 5 s on a 50 to 55 s job run, where startup is estimated at 35 s).

The main concern is that we will have 400 jobs running with a low number of lines each, plus some (say 5 to 10) with higher volumes (1000+ lines). As far as I've seen, we didn't hit the CPU/RAM/IO limits, but we're stuck at 50 parallel jobs due to bad MFILES, T30FILE and lock table configuration in DS (we got errors on the number of file descriptors we could open). I'm trying to schedule tests with different values in order to tune those parameters.

If anyone has previous experience with this kind of situation, any input is welcome :)

Regards,
Eric

Posted: Fri Jun 10, 2011 12:21 pm
by PaulVL
Have you looked into the Grid Enablement Toolkit to push some of your jobs onto other servers?

Posted: Wed Jun 15, 2011 2:42 am
by eph
Hi,

Unfortunately, we are in an awkward situation since our "grid" installation is an active-passive one (one server is up; in case of failure the second one takes over the activity). So we can't use the second server to balance the load.

Anyway, bigger tests (on a target-type server this time) are planned for tomorrow; I'll tell you if I get better results by modifying the uvconfig parameters (essentially the open file descriptor limits).

Eric

Posted: Tue Jun 21, 2011 7:41 am
by eph
Hi,

I couldn't modify and test the new parameter values due to client requests on the target environment. Anyway, as I've seen in the IBM Redbook on DataStage best practices, it is not a good idea to create a huge pool of files to process; they suggest processing as much as possible in one shot...

I'll post here when I have more input.

Eric

Posted: Mon Jul 11, 2011 2:04 am
by pigpen
Maybe you can try using server jobs for the low-volume data processing.

Posted: Mon Jul 11, 2011 3:43 am
by ray.wurlod
Tuning tips for running large numbers of parallel jobs in parallel? Don't.

Posted: Tue Aug 02, 2011 9:48 am
by eph
Hi all,

I was finally able to run some preliminary tests recently.

Here is the before/after:

Before
====
The uvconfig parameters were:

MFILES = 2000
T30FILE = 65536
GLTABSZ = 50
RLTABSZ = 50
MAXRLOCK = 40

We couldn't handle more than 50 jobs running in parallel.

After
====
The uvconfig parameters were:

MFILES = 2000
T30FILE = 200
GLTABSZ = 75
RLTABSZ = 75
MAXRLOCK = 74

We've been able to run 150 to 200 jobs in parallel (maybe a little more).

That's all for now with this small test of our new configuration. I will run some more precise tests this month and post my conclusions here.
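
For the record, here is the sequence we follow to apply the uvconfig changes (from memory, using the usual engine paths; double-check against your installation and the IBM documentation before running it):

cd $DSHOME            # the DSEngine directory of the installation
. ./dsenv             # source the engine environment
vi uvconfig           # edit MFILES, T30FILE, GLTABSZ, RLTABSZ, MAXRLOCK
bin/uv -admin -stop   # stop the engine (with no jobs running)
bin/uvregen           # regenerate the binary configuration from uvconfig
bin/uv -admin -start  # restart the engine
bin/smat -t           # check the values actually in effect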

Eric

Posted: Tue Aug 02, 2011 3:40 pm
by ray.wurlod
MAXRLOCK must always be (RLTABSZ - 1). That may have something to do with why the first uvconfig settings gave problems.
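Applied to the two sets of values posted above:

RLTABSZ = 50  ->  MAXRLOCK should have been 49 (the first attempt used 40)
RLTABSZ = 75  ->  MAXRLOCK = 74 (the second attempt, which follows the rule)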

Posted: Tue Aug 02, 2011 6:22 pm
by chulett
If something must always be "X" I wonder why an option to set it to anything else would even exist? Was that not always the case?

Posted: Tue Aug 02, 2011 9:39 pm
by ray.wurlod
It was not always the case. Shared (READL) locks used to be implemented differently. Nowadays they are implemented as wait lists that queue in much the same way (and on the same number of semaphores) as the exclusive (READU) locks.

Go to the next Rocket Software U2 University to learn more.

Posted: Fri Sep 09, 2011 8:41 am
by eph
As a conclusion to this thread, I've decided to raise the number of jobs allowed to run in parallel from 40 to 70.
As mentioned, the new parameters have enabled our DS installation to run more jobs in parallel, but in a second test with a more complex job we reached the physical limits of the server at around 100 parallel jobs.


Eric