Sub Sequence will improve performance
- Participant
- Posts: 222
- Joined: Tue Aug 30, 2005 2:07 am
- Location: pune
Hi Folks,
My sequence sometimes aborts and sometimes completes successfully. It contains nearly 30 jobs, 10 of which have big derivations. Sometimes Derivation 1 fails; sometimes Derivation 1 runs successfully but Derivation 2 fails; and sometimes the sequence completes without problems.
All of the aborting jobs give an abort message like "node_node1: Player 12 terminated unexpectedly."
I call the 3 derivations at once, in parallel, so I have 3 Sequencers inside the Sequence.
I am considering the following changes to solve this issue:
1) Run the derivations one by one, that is, sequentially.
2) Create one more sub-sequence inside the sequence for all the derivation jobs.
Which approach is better for getting out of this problem?
Your inputs are much appreciated.
Thanks & Regards
Nagesh.
NageshSunkoji
If you know anything SHARE it.............
If you Don't know anything LEARN it...............
Nagesh,
You can't design any kind of workaround until you have found the cause of your jobs aborting. The error message by itself doesn't mean much; it is like trying to diagnose why an engine isn't running from the red "motor" light on the panel.
Have you tried any diagnosis of the issue yet?
Arnd,
Thanks for your reply.
I have tried to identify the problem. The main point is that the same sequence works fine in Dev but fails in SYS and UAT. I got the following abort messages for all my aborting jobs:
node_node1: Player 12 terminated unexpectedly.
Fatal Error: waitForWriteSignal(): Premature EOF on node servername
buffer(10),1: Error in writeBlock - could not write 32
Fatal Error: APT_BufferOperator::writeAllData() write failed. This is probably due to a downstream operator failure.
Lookup1 : Fatal Error: Unable to allocate communication resources
Error in writeBlock - could not write 32
I got all of the fatal errors above. I suspect they are caused by running my 3 derivation jobs in parallel, which is why I am planning to run them sequentially.
Your inputs are most valuable.
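When several fatal messages arrive together like this, it can help to list them in the order they were logged and note which stage, if any, each one names. One of the messages above itself says the write failure is "probably due to a downstream operator failure", so not every line is necessarily an independent problem. The following is only a minimal sketch, assuming the job log has been saved as plain text; the sample text is copied from the messages above, and the function names are hypothetical.

```python
# Sample log text copied from the post above. A real log would be read from a file.
LOG = """\
node_node1: Player 12 terminated unexpectedly.
Fatal Error: waitForWriteSignal(): Premature EOF on node servername
buffer(10),1: Error in writeBlock - could not write 32
Fatal Error: APT_BufferOperator::writeAllData() write failed. This is probably due to a downstream operator failure.
Lookup1 : Fatal Error: Unable to allocate communication resources
Error in writeBlock - could not write 32
"""


def stage_of(line):
    """Return the text before 'Fatal Error' (for example a stage name), or None."""
    prefix = line.split("Fatal Error", 1)[0].strip().rstrip(":").strip()
    return prefix or None


def fatal_lines(log_text):
    """Return (line_number, stage, text) for every line that contains 'Fatal Error'."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if "Fatal Error" in line:
            hits.append((number, stage_of(line), line.strip()))
    return hits


if __name__ == "__main__":
    for number, stage, text in fatal_lines(LOG):
        print(number, stage or "(no stage named)", text, sep=" | ")
```

In this sample only one fatal line names a stage (Lookup1), which may be a reasonable place to start looking, though the log alone does not prove it is the cause.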
NageshSunkoji
Be ready to take up both options. You need to find the peak CPU usage at each point in the run and schedule the jobs accordingly. You can split up the job and run it in parallel until CPU usage reaches the prescribed limit.
Do think about the total number of processes running on each node for the jobs that are called at that point.
Talk to your Unix SA about increasing swap space and cache if required.
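The idea of running jobs in parallel only up to a limit can be sketched outside DataStage. The example below is an illustration in plain Python, not DataStage sequence code: `run_throttled`, `PeakTracker` and `fake_job` are hypothetical names, and the "jobs" are just short sleeps. Setting `max_parallel` to 1 gives the fully sequential option, and larger values allow more overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class PeakTracker:
    """Records the highest number of jobs that were active at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self.active -= 1
        return False


def fake_job(name, tracker, seconds=0.05):
    """Build a stand-in job: it sleeps briefly and returns its own name."""
    def run():
        with tracker:
            time.sleep(seconds)
        return name
    return run


def run_throttled(jobs, max_parallel):
    """Run the callables with at most max_parallel active at once.

    Results are returned in submission order.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]


if __name__ == "__main__":
    tracker = PeakTracker()
    names = ["Derivation1", "Derivation2", "Derivation3"]
    done = run_throttled([fake_job(n, tracker) for n in names], max_parallel=2)
    print(done, "peak concurrency:", tracker.peak)
```

The right value for the limit has to come from measuring CPU, process counts and memory on the actual server; the sketch only shows the throttling pattern.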
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Hi Kumar,
Thanks for your response.
I have split the sequence into two parts, keeping some jobs in one sequence and the rest in the other. Now I am waiting for the result.
Regards
Nagesh
NageshSunkoji
Nagesh,
As you mentioned, the same set of jobs runs fine in the DEV environment but fails in UAT. I suggest you find out whether there is any significant difference in configuration (DS, OS and DB) between the two environments. Monitor resource usage while your jobs are running. As Kumar mentioned, check whether the swap space is set to an inappropriate value. The best thing would be to 'catch your Sys Admin'.
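One way to make the comparison concrete is to capture the settings of each environment as simple KEY=VALUE text and diff them. This is a rough sketch only: the keys and values in the two sample dumps are made up for illustration (they are not taken from this thread or from any real server), and `parse_settings` and `diff_settings` are hypothetical helper names.

```python
# Hypothetical dumps; in practice these would be collected on each server.
DEV_DUMP = """\
# captured on DEV (illustrative values only)
swap_mb=8192
max_user_processes=4096
nodes=2
"""

UAT_DUMP = """\
# captured on UAT (illustrative values only)
swap_mb=2048
max_user_processes=4096
open_files=1024
"""


def parse_settings(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(first, second):
    """Return {key: (first_value, second_value)} for keys whose values differ.

    A key that is missing on one side is reported with None for that side.
    """
    differences = {}
    for key in sorted(set(first) | set(second)):
        if first.get(key) != second.get(key):
            differences[key] = (first.get(key), second.get(key))
    return differences


if __name__ == "__main__":
    dev = parse_settings(DEV_DUMP)
    uat = parse_settings(UAT_DUMP)
    for key, (dev_value, uat_value) in diff_settings(dev, uat).items():
        print(f"{key}: DEV={dev_value} UAT={uat_value}")
```

Which settings are worth capturing is a question for the system administrator; the sketch only shows how to spot the differences once they have been collected.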
Nitin Jain | India
If everything seems to be going well, you have obviously overlooked something.
Thanks Nitin for your response.
Hi All,
After splitting my big sequence into two sequences, I am still facing the same problem. It works in Dev but aborts many times in SYS and OAT with a Unix SIGKILL, and then runs successfully after some attempts. Can I split those sequences again into two more, giving 4 sequences in place of one? I am not sure whether that will solve my problem.
The DSXperts suggested checking the swap space and other things. Can I ask you all what I should check in the different environments, and how to compare the two? Which parameters are important on a UNIX server?
Your inputs are very valuable to me.
Regards
Nagesh.
NageshSunkoji
Nagesh,
Try turning off the Job Monitor in SYS and UAT via
DS Administrator -> Project -> Properties -> Environment: look for APT_NO_JOB_MON in the Reporting section, change the value to "True", save it by clicking OK, and then run your job.
Let me know if this fixes your problem.
Pneuma Lin.
pneumalin@yahoo.com
pneumalin,
Thanks for your response.
I first tried changing the environment variable APT_MONITOR_SIZE from its default (blank) to 100000, but that did not completely solve my problem. Then I set the environment variable APT_NO_JOB_MON to True. Now my sequence completes successfully without any aborts. I ran it as one big sequence only, and it still completed successfully.
Kudos to pneumalin and the others who guided me out of this problem.
I still have one concern: if I set APT_MONITOR_SIZE to 100000 and APT_NO_JOB_MON to TRUE, will it affect performance in any way? Please let me know your comments.
Regards,
Nagesh.
DSXCHANGE IS TOO POWERFUL AND AWESOME
NageshSunkoji
You are welcome! It's always good to hear that someone has got out of a head-scratching problem!
Here is some advice for you to consider:
1. The performance of all the other jobs is not harmed by this action. Instead, it will be improved greatly.
2. We turned it off at project level in our production environment since we don't need it, and we see no reason to waste resources there. The Job Monitor is a Java class that talks to the DS job at runtime to collect all the count information and then posts it back to the DS Engine.
3. There is a patch that addresses this problem in 7.5.1 if you want to fix it. Contact IBM support, or upgrade DS Server directly to 7.5.2.
Cheers.
Pneuma Lin.
pneumalin@yahoo.com