Timed out while waiting for an event
Posted: Fri Nov 24, 2006 4:08 am
Hi,
Lately we're getting a lot of errors on
wCleanupCBOAggregates..JobControl (@AGR_CBO_ORIGINATING): Controller
problem: Error calling DSRunJob(CleanupAgrCBO), code=-14
[Timed out while waiting for an event] .
Before anyone mentions this: I did a search and found that this is caused by an overload of the system. I've also found the post about ecase 70788 (a patch to set DSD.RUN from 60 seconds to 600 seconds), which is of course a workaround, not a solution.
However: if I look at the load of our Unix server, it is not at its limits when these errors occur (I checked number of processes, CPU, memory and disk space), so it seems to be more of a DataStage overload than a server overload.
Does anyone have an idea about the deciding factor in this?
For example, is there a difference between:
- 50 jobs with 2 sequential stages being started together
- 2 jobs with 50 sequential stages being started together
- 2 jobs with 5 stages, each using 10 parallel processes
Knowing this, we can work out the best way to resolve it: do we mainly sequentialize (if that's a word?) the workflows to start fewer jobs in parallel, do we split jobs into multiple smaller jobs, or do we decrease the parallelism inside the jobs?
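
To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python. It rests on an assumption I cannot confirm for DataStage: that every stage (or every parallel process of a stage) counts as one OS process. The scenario names and the `totals` helper are made up for illustration only.

```python
# Hypothetical model, NOT documented DataStage behaviour:
# assume each stage (or each parallel process of a stage) is one OS process.
scenarios = {
    "50 jobs x 2 stages": {"jobs": 50, "stages": 2, "procs_per_stage": 1},
    "2 jobs x 50 stages": {"jobs": 2, "stages": 50, "procs_per_stage": 1},
    "2 jobs x 5 stages x 10 procs": {"jobs": 2, "stages": 5, "procs_per_stage": 10},
}


def totals(scenario):
    """Return (simultaneous job starts, total processes) for one scenario."""
    job_starts = scenario["jobs"]
    processes = scenario["jobs"] * scenario["stages"] * scenario["procs_per_stage"]
    return job_starts, processes


for name, scenario in scenarios.items():
    job_starts, processes = totals(scenario)
    print(f"{name}: {job_starts} simultaneous job starts, {processes} processes")
```

Under that assumption all three cases end up with the same nominal number of processes, and only the number of jobs started at the same moment differs. So my question is really whether the timeout depends on the number of simultaneous job starts or on the total number of processes.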