Performance Tuning Tips

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Performance Tuning Tips

Post by eph »

Hi all,

I'm just wondering whether you have any DS tuning tips for a "special case".

My requirement is to be able to launch a lot of jobs in parallel (namely between 100 and 400), with little input for each. Each job will process 100 to 1000 rows, coming from sequential files, XML or database tables.

The functional requirements are quite complex, which leads to complex jobs with a lot of aggregations, sorts, lookups...

I understand that DS deals efficiently with high volumes in a few jobs, but from what I've seen it has performance issues when dealing with many low-volume jobs in parallel. I'm already going to modify the uvconfig T30FILE and MFILES settings, since I got an error about the number of open file descriptors when running more than 50 jobs in parallel, even though the system seems able to handle at least 150.
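
For what it's worth, this is roughly how I sanity-check what the OS itself allows before touching uvconfig (Linux commands; "dsadm" below is just an example engine user, adjust to your install):

    ulimit -n                    # per-process open file descriptor limit for the current user
    cat /proc/sys/fs/file-max    # system-wide limit on open file descriptors
    lsof -u dsadm | wc -l        # rough count of descriptors currently held by the engine user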

Are there any OS/DataStage tuning or configuration tips that could help? Have you already faced this kind of requirement? Tell me if you need more input.

Thanks in advance,
Eric
jgreve
Premium Member
Posts: 107
Joined: Mon Sep 25, 2006 4:25 pm

Re: Performance Tuning Tips

Post by jgreve »

eph wrote: The functional requirements are quite complex, which leads to complex jobs with a lot of aggregations, sorts, lookups...
That is an interesting problem; please post what you end up doing.

Suggestion: job start-up times took a hit in version 8 when launching via dsjob or the dsapi; it has to fire up a JVM now, I suppose in order to send web-service requests to DataStage.
I suspect you could shave off a significant part of your overhead by launching your jobs with one initial sequencer instead of calling dsjob something like 400 times. (I'm planning to benchmark that, but my research time has been sparse. *sigh* Anyway, it's just something I've been thinking about and wanted to put out for your consideration.)
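
To put a shape on that overhead: what I have in mind is one client (and JVM) start-up per invocation, in a pattern roughly like this (a sketch only; the project name, job name and parameter are placeholders, and the exact dsjob options depend on your release):

    # one dsjob call - and one client start-up - per input file
    for f in /data/pool/*.xml
    do
        dsjob -run -jobstatus -param InputFile="$f" MYPROJECT SmallFileJob &
    done
    wait

A single sequence job started once, with Job Activity stages (or a loop) firing the 400 jobs from inside the engine, would pay that client start-up cost only once.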


Questions:
What kind of run time do you want to realize for all jobs: 24 hours? 2 hours?
What are you running on, a single SMP machine? A Grid?
If SMP, how many cores? And what kind of configuration file(s)? (I'd expect all of the small jobs to run on single-nodes).
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Re: Performance Tuning Tips

Post by eph »

jgreve wrote:
eph wrote: The functional requirements are quite complex, which leads to complex jobs with a lot of aggregations, sorts, lookups...
That is an interesting problem; please post what you end up doing.
I ended up developing my jobs :lol:
Business requirements and the data integration platform's architecture make it difficult to reduce the overall job complexity. Right now the platform is quite stable in terms of business requirements, so I don't want to refactor the existing development (small improvements weighed against huge rework and business-requirement stability issues).
jgreve wrote: Suggestion: job start-up times took a hit in version 8 when launching via dsjob or the dsapi; it has to fire up a JVM now, I suppose in order to send web-service requests to DataStage.
I suspect you could shave off a significant part of your overhead by launching your jobs with one initial sequencer instead of calling dsjob something like 400 times. (I'm planning to benchmark that, but my research time has been sparse. *sigh* Anyway, it's just something I've been thinking about and wanted to put out for your consideration.)
That's interesting. I may consider building such a sequencer with a dynamic start of jobs against a pool of files to process, but I can already handle resource quotas using the Control-M scheduler (which is part of our architecture).
jgreve wrote: Questions:
What kind of run time do you want to realize for all jobs: 24 hours? 2 hours?
As far as I know, we won't have more than 2 hours to process everything.
jgreve wrote: What are you running on, a single SMP machine? A Grid?
If SMP, how many cores? And what kind of configuration file(s)? (I'd expect all of the small jobs to run on single-nodes).
We are on an SMP machine, 16 cores, 32 GiB RAM. The default configuration file used on the project is a 2-node one. We've already tested a 1-node configuration without noticing any significant difference (I guess we gained 1 to 5 s on a 50 to 55 s job processing time; start-up is estimated at 35 s).
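
For reference, the 1-node configuration we tried is just the usual minimal APT configuration file, roughly like the following (the host name and paths are placeholders, not our real values):

    {
        node "node1"
        {
            fastname "our_ds_server"
            pools ""
            resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
            resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
        }
    }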

The main concern is that we will have 400 jobs running with a low number of rows each, plus a few (say 5 to 10) with higher volumes (1000+ rows). From what I've seen we didn't hit the CPU/RAM/IO limits, but we're stuck at 50 parallel jobs due to a bad MFILES, T30FILE and lock table configuration in DS (we got errors about the number of file descriptors we could open). I'm trying to schedule tests with different values in order to tune those parameters.

If anyone has previous experience with this kind of situation, any input is welcome :)

Regards,
Eric
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Have you looked into the Grid Enablement Toolkit to push some of your jobs onto other servers?
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi,

Unfortunately, we are in an awkward situation since our "grid" installation is an active-passive one (one server is up; in case of failure the second one takes over the activity). So we can't use a second server to balance the load.

Anyway, bigger tests (on a target-type server this time) are planned for tomorrow; I'll let you know if I get anything better by modifying the uvconfig parameters (essentially the open file descriptor limits).

Eric
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi,

I couldn't modify and test the new parameter values due to client requests on the target environment. Anyway, from what I've seen in the IBM Redbook on DataStage best practices, it is not a good idea to create a huge pool of files to process; they suggest processing as much as possible in one shot...

I'll post here when I have more input.

Eric
pigpen
Participant
Posts: 38
Joined: Thu Jul 13, 2006 2:51 am

Post by pigpen »

Maybe you can try using server jobs for low-volume data processing.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Tuning tips for running large numbers of parallel jobs in parallel? Don't.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi all,

I was finally able to run some preliminary tests recently.

Here is the before/after:

Before
====
The uvconfig parameters were:

MFILES = 2000
T30FILE = 65536
GLTABSZ = 50
RLTABSZ = 50
MAXRLOCK = 40

We couldn't handle more than 50 jobs running in parallel.

After
====
The uvconfig parameters were:

MFILES = 2000
T30FILE = 200
GLTABSZ = 75
RLTABSZ = 75
MAXRLOCK = 74

We've been able to run 150 to 200 jobs in parallel (maybe a little more).
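
For anyone wanting to reproduce this: uvconfig changes of this kind are applied with the engine stopped, roughly as follows (a generic sketch; check the paths and commands against your own installation before running anything):

    cd $DSHOME
    bin/uv -admin -stop     # stop the DataStage engine
    vi uvconfig             # adjust MFILES, T30FILE, GLTABSZ, RLTABSZ, MAXRLOCK
    bin/uvregen             # regenerate the engine's binary configuration
    bin/uv -admin -start    # restart the engine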

That's all for now with this small test of our new configuration. I will run some more precise tests this month and post my conclusions here.

Eric
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

MAXRLOCK must always be (RLTABSZ - 1). That may have something to do with why the first uvconfig settings gave problems.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If something must always be "X" I wonder why an option to set it to anything else would even exist? Was that not always the case?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It was not always the case. Shared (READL) locks used to be implemented differently. Nowadays they are implemented as wait lists that queue in much the same way (and on the same number of semaphores) as the exclusive (READU) locks.

Go to the next Rocket Software U2 University to learn more.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

As a conclusion to this thread, I've decided to raise the number of jobs that can be run in parallel from 40 to 70.
As mentioned, the new parameters have enabled our DS installation to run more jobs in parallel, but after a second test on a more complex job we reached the physical limits of the server at around 100 parallel jobs.


Eric