Issues with Sequences

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Pete Morris
Charter Member
Posts: 39
Joined: Wed Jun 23, 2004 4:33 am
Location: UK, chester

Issues with Sequences

Post by Pete Morris »

Are there any constraints on the number of parallel jobs that can be invoked from a sequence? We seem to get job failures when more than 20 jobs are invoked in parallel from a sequence.
Pete Morris
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

The limitation is not in the sequence itself, but might be related to the amount of resources that 20 concurrent PX jobs might use on your system. What sort of job failure causes do you get?
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

The constraints are forced by resources available on your machine.
What kind of error messages are reported?
Regards,
S. Kirtikumar.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Timeouts and the dreaded -14 error trying to start jobs are signs of a resource constrained system...
-craig

"You can never have too many knives" -- Logan Nine Fingers
Pete Morris
Charter Member
Posts: 39
Joined: Wed Jun 23, 2004 4:33 am
Location: UK, chester

Post by Pete Morris »

My understanding of UNIX is that there is no limit to how many processes can be started at any one time, as UNIX will then share system resources between them all. Therefore I am confused as to why DataStage can have trouble starting jobs.
Pete Morris
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

There is a limit. This limit and constraint can be better explained to you by a UNIX admin. They have to tune the kernel appropriately to support massive simultaneous processing, and if you push too hard you can reach resource limitations beyond the kernel tuning.
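As a sketch of the kind of limit an admin would look at (Linux-oriented; the exact limits and their names vary by UNIX flavour), the per-user process ceiling can be inspected from Python's standard `resource` module:

```python
# Inspect the per-user process limit (RLIMIT_NPROC) that a UNIX admin
# would tune. This is a Linux/BSD illustration; the constant is not
# available on every platform.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("max user processes (soft/hard):", soft, hard)
```

On a resource-constrained box, starting 20 parallel PX jobs (each of which spawns several processes per stage per node) can approach limits like this one.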
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Pete Morris wrote:...no limit to how many processes can be started at any one time as unix will then share system resource between them all...
That a given UNIX machine might nominally be able to start 2^8 or 2^16 PIDs according to its configuration doesn't mean that it can actually run that many.
All systems have a memory limit. This is usually the sum of the physical memory in the machine and some amount of other, usually disk, storage reserved to hold swapped-out memory. Each PID shares a bit of memory with other processes but also has its own private memory space. If we were to assume that each process uses 1Mb of memory (a very conservative number considering what they will be doing), then a machine that has 512Mb of main memory and another 512Mb of swap space could run roughly 1000 processes (assuming the operating system itself doesn't use any of this).
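The arithmetic above can be sketched directly; the figures come from the example in the post, and the 1Mb-per-process footprint is the stated assumption:

```python
# Back-of-envelope process capacity using the figures from the post above.
ram_mb = 512           # physical memory in the example machine
swap_mb = 512          # swap space reserved to hold paged-out memory
per_process_mb = 1.0   # assumed private footprint per process (conservative)

max_processes = int((ram_mb + swap_mb) / per_process_mb)
print(max_processes)   # → 1024, i.e. the "roughly 1000 processes" quoted
```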

So by no means is the number of processes on any system unlimited, even if these are doing nothing at all.

Taking this just one step further, each process needs to consume CPU. A simple model for the 1000-process system theorized above would have each one use 1/1000 of the available CPU. But there are also a number of UNIX or other OS processes that need their share. And since chances are that a process's memory has been pushed out to disk (since system memory is full), the OS needs CPU cycles to locate some other process that it can swap out, copy that process's state to swap, and then read the current one back into memory. All of this takes so long (more than 1/1000 of the available CPU) that by the time a process gets to be executed it is already time for it to be pushed out in favour of another. This escalates until the system is thrashing and effectively spending 100% of its time maintaining itself - sort of like big government :)

To bring this back to your probable problem - if the system starts so many active processes contending for scarce resources (CPU, I/O, memory, database locks, etc.) at the same time, it slows down dramatically; and there are hard-coded timers in DataStage, as well as in other applications, that come into effect when the machine is that slow.
Pete Morris
Charter Member
Posts: 39
Joined: Wed Jun 23, 2004 4:33 am
Location: UK, chester

Post by Pete Morris »

What are the timeout mechanisms, and is there a way to override them?
Pete Morris
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

This thread doesn't really point out any timeouts directly. There are some hard-coded values in DataStage that the engineers put in, thinking that no normal system could hit them during processing. As we all know, systems are often used in ways that designers and engineers don't consider, and thus what seems to be a reasonable 60-second maximum wait time for a process to send back its "I'm alive" message {which usually comes back in milliseconds} is no longer sufficient. This can be the case when starting up jobs and can trigger the -14 timeout message; IBM recently brought out a workaround (I wouldn't call this a bug, so the solution really isn't a "patch" but an enhancement) to avoid this timeout. But I wouldn't recommend putting in changes like this - it is much more important to try to avoid such timeouts, either by redesign or perhaps through hardware reconfiguration or upgrades, since if these limits are reached the system is probably so overloaded that response times will be abysmal and system overhead will take up more cycles than actual processing.

The most common timeout is with IPC, and those values can be changed by us. It almost never makes any sense to change the actual buffer sizes, and I've rarely seen cases where there is a valid reason to increase the default timeouts significantly.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

When I was learning operating system (PRIMOS) tuning in nineteen mumble mumble, I was told that the optimum point is "just before the machine starts thrashing". That, of course, is a moving target, but the method was usually to ramp up the parameter in question until thrashing began, then back off a bit. This was done under (perhaps simulated) "normal" or "heavy" load conditions.

Note how accurately this is quantified. Not.

You might ask - demand to know - how long it will take to drive from point A to point B. The answer will depend on many factors, over only some of which you have any control, and the answer may be different at different times.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

You remember stuff from nineteen mumble mumble :shock:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's one of the secrets of my success. :wink:

The Three Secrets of Success
1. see above
2. don't tell them everything you know
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

And the third one is not for all, right :?:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.