Sequence Error

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

novice_pgr
Participant
Posts: 8
Joined: Thu Jan 19, 2006 11:45 pm

Sequence Error

Post by novice_pgr »

Hi All

<Seq_Name>.JobControl
(@JAC_SEPCS_MDSNG_NAMPLT);controller problem; Error calling
DSRunJob(<jobname>,code=-14
(Timed out while waiting for an event)


Has anyone faced this before?
Any suggestions for resolving it?
Nageshsunkoji
Participant
Posts: 222
Joined: Tue Aug 30, 2005 2:07 am
Location: pune

Post by Nageshsunkoji »

Hi,

I think one of your jobs was not in a runnable state. That means one of your jobs is not in a compiled state: it is either aborted or not compiled. So check the status of your jobs, and if one is not in a compiled state, compile it.

Regards
Nagesh.
NageshSunkoji

If you know anything SHARE it.............
If you Don't know anything LEARN it...............
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, that's not the problem. If you search the forum for either the error in question or the phrase 'timed out while waiting for event' you'll see it's a resource issue. You are asking too much of your system, spooling up more jobs than it can handle.

There is a hard-coded value deep in the engine of 60 or 90 seconds (I do believe) as the 'timeout' value. When it attempts to start a job and it takes longer than that before it comes back and acknowledges that it started, it throws that timeout error.

That being said, what about some specifics? Hardware? Job design? When you say 'fourth job' how many actual processes are we talking about, up to and including that point?
-craig

"You can never have too many knives" -- Logan Nine Fingers
novice_pgr
Participant
Posts: 8
Joined: Thu Jan 19, 2006 11:45 pm

statistics info

Post by novice_pgr »

Craig

System = SunOS
Node = ssi2
Release = 5.8
KernelID = Generic_108528-18
Machine = sun4u
NumCPU = 2


The job design is an insert/update on the target table, but the sequence invokes 18 jobs, so all 18 jobs are started in parallel.

So, are there any fixes for this? Are there any parameters to change somewhere in some config file? :wink:
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

In the Administrator, under 'Operator specific', there is an option called DSIPC_OPEN_TIMEOUT, whose default might be 30. This can be increased to 300 - 600.
But you still need to nail down the cause of the timeout. If it really is a lack of resources, it may not be advisable to run all the jobs in parallel. Split up your sequence, reschedule your jobs, or try invoking a 'SLEEP nn' command in an Execute Command activity to slow the pace instead of calling all the jobs at a stretch.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

18 jobs at one time is a bit much for a 2 CPU box. Each job is contending for time on the CPU. I would run your jobs in groups of 4-5. Keep in mind that when you move this into a production environment, there are going to be other jobs running besides the ones you have created here. So you need to keep in mind not only the design of your job, but also the design of the jobs running in your current production environment.
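
The grouping idea above can be sketched roughly as follows. This is only an illustration in Python (the thread itself contains no code): the job names and the `run_job` callable are hypothetical placeholders, and in a real sequence the same effect would more likely come from chaining groups of Job Activity stages so that one group starts only after the previous one finishes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_in_groups(jobs: List[str], group_size: int,
                  run_job: Callable[[str], int]) -> Dict[str, int]:
    """Run jobs one group at a time.

    Jobs inside a group run concurrently; the next group starts only
    after every job in the current group has returned.
    """
    results: Dict[str, int] = {}
    for group in chunk(jobs, group_size):
        # Leaving the `with` block waits for all workers in this group.
        with ThreadPoolExecutor(max_workers=len(group)) as pool:
            for name, status in zip(group, pool.map(run_job, group)):
                results[name] = status
    return results


if __name__ == "__main__":
    # Hypothetical job names; run_job is a stand-in that only reports success.
    job_names = [f"job_{n:02d}" for n in range(1, 19)]
    statuses = run_in_groups(job_names, 5, lambda name: 0)
    print(len(statuses), "jobs run in", len(chunk(job_names, 5)), "groups")
```

With 18 jobs and a group size of 5 this gives three groups of five and one of three, so no more than five jobs are ever starting at the same moment.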
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Keep in mind that - potentially - every stage in a parallel job requires one process on each processing node in the configuration file, not to mention one section leader process per job per processing node and one conductor process per job on the conductor node. Multiply this by 18 and you have way too many processes for a two-CPU machine. Upgrade to at least a 32-CPU machine. Or a cluster of 16 two-CPU machines.
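
To put rough numbers on that, here is a back-of-the-envelope sketch in Python of the arithmetic described above. The stage count and node count in the example are assumptions made up for illustration (the thread gives neither), and the result is an upper bound, since the post says "potentially".

```python
def estimate_processes(stages_per_job: int, nodes: int, jobs: int) -> int:
    """Upper-bound estimate of processes for `jobs` parallel jobs started together.

    Per job, following the description in the post:
      - one process per stage on each processing node
      - one section leader per processing node
      - one conductor on the conductor node
    """
    per_stage_processes = stages_per_job * nodes
    section_leaders = nodes
    conductor = 1
    return jobs * (per_stage_processes + section_leaders + conductor)


if __name__ == "__main__":
    # Assumed values: 5 stages per job and a 2-node configuration file.
    print(estimate_processes(stages_per_job=5, nodes=2, jobs=18))  # prints 234
```

Under those assumed values each job needs up to 13 processes, so 18 jobs started together come to as many as 234 processes competing for two CPUs.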
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
novice_pgr
Participant
Posts: 8
Joined: Thu Jan 19, 2006 11:45 pm

Post by novice_pgr »

I want to kill the processes that are currently being run by one user and start running them fresh.

If I do ps -ef | grep <user>
I get a process like the one below, which I don't have permission to kill.

Can you tell me what this process is doing?

/u01/appl/DataStage/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag ssi2 10

I get this error when I try running the jobs even when no other jobs are running. So I want to make sure that there are currently no active processes running for that user ID.
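
For checking whether a user still has leftover `osh` processes, a small filter over `ps -ef` output could look like the sketch below. It is written in Python purely for illustration and only lists processes, it never kills anything. The user name passed in is whatever account you are checking; the parsing assumes the first two columns of `ps -ef` are the user and the PID, which is the usual layout but worth confirming on your platform. The `-APT_PMsectionLeaderFlag` argument in the command line above suggests that process is one of the section leader processes described earlier in the thread.

```python
import subprocess
from typing import List, Tuple


def find_osh_processes(ps_output: str, user: str) -> List[Tuple[int, str]]:
    """Return (pid, rest_of_line) for `user`'s processes that are running osh."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # user, PID, everything else
        if len(parts) < 3:
            continue
        uid, pid, rest = parts
        if uid != user or not pid.isdigit():
            continue
        # Match a bare "osh" token or a path that ends in "/osh".
        if any(tok == "osh" or tok.endswith("/osh") for tok in rest.split()):
            matches.append((int(pid), rest))
    return matches


def list_osh_for_user(user: str) -> List[Tuple[int, str]]:
    """Run `ps -ef` and filter its output; requires Python 3.7 or later."""
    result = subprocess.run(["ps", "-ef"], capture_output=True,
                            text=True, check=True)
    return find_osh_processes(result.stdout, user)
```

An empty list from `list_osh_for_user` would indicate that no `osh` processes are visible for that user. Note that some `ps` implementations truncate long user names, in which case the exact-match comparison would need adjusting.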
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Post this as a separate thread on the forum, and do a search on how to kill a process. It's been covered many times in the forum.
Regards
Siva

Listening to the Learned

"The most precious wealth is the wealth acquired by the ear Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar