Hi Gurus,
When I run a job from the Unix prompt, it is initiated and the OSH script is then started. From there it does not move any further; it gets stuck.
How can we track the status, and why is it getting stuck like that?
Thanks.
Job is getting stuck
Identify the process ids and then use the "truss -p {pid}" command to see what these processes might be doing.
Does this mean if you start the job from the director it does not get "stuck"? What does the job do?
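A minimal sketch of the "identify the process ids" step, assuming `ps -ef` style output in which the PID is the second column. The function name `find_job_pids` and the job name `MyJob` are hypothetical placeholders, not something from this thread.

```shell
# Hypothetical helper: print the PIDs of processes whose command line
# mentions a given job name. Reads `ps -ef` style output on stdin.
find_job_pids() {
    # $1 = job name to search for (placeholder, e.g. MyJob)
    [ -n "$1" ] || return 1
    # Pass the name through the environment so that awk's own command
    # line does not contain it and match itself in the ps listing.
    JOB="$1" awk 'NR > 1 && index($0, ENVIRON["JOB"]) > 0 { print $2 }'
}

# Possible use on the server (MyJob is a placeholder):
#   ps -ef | find_job_pids MyJob
# and then attach to each PID in turn:
#   truss -p <pid>
```

The listing may still include the shell that runs the search if its own command line mentions the job name, so check the PIDs against the full `ps -ef` output before attaching.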
ArndW wrote: Identify the process ids and then use the "truss -p {pid}" command to see what these processes might be doing.
Does this mean if you start the job from the director it does not get "stuck"? What does the job do?
Hi Arnd,
Sorry for posting my query in this thread, but I think it is the same problem.
I am getting the same problem and I am not able to understand what is going on inside.
While analyzing the truss output I can see that the sequence is running fine.
The job is also running, but I am unable to analyze it completely.
Output of truss -p for the sequence:
Code:
[iehibu12] /u01/iisGIDev truss -p 1482774
_nsleep(0x00000000, 0x00000000) (sleeping...)
_nsleep(0x00000000, 0x00000000) = 0
sigprocmask(0, 0x00000000, 0x302DBE34) = 0
klseek(39, 0, 6144, 0x00000000) = 0
kread(39, "\0\018 $\0\0\0 $\0\0\b C".., 2048) = 2048
klseek(39, 0, 83968, 0x00000000) = 0
kread(39, "\001 H T\0\018 T\0\0\b03".., 2048) = 2048
klseek(39, 0, 110592, 0x00000000) = 0
kread(39, "\001B0 L\001 H L\0\0\b03".., 2048) = 2048
klseek(39, 0, 114688, 0x00000000) = 0
kread(39, "\001C0 L\001B0 L\0\0\f03".., 2048) = 2048
klseek(39, 0, 131072, 0x00000000) = 0
kread(39, "\002\0 @\001C0 @\0\0\f03".., 2048) = 2048
klseek(39, 0, 145408, 0x00000000) = 0
kread(39, "\002 8 @\002\0 @\0\0\f03".., 2048) = 2048
klseek(39, 0, 151552, 0x00000000) = 0
kread(39, "\002 P H\002 8 H\0\0\f03".., 2048) = 2048
klseek(39, 0, 178176, 0x00000000) = 0
kread(39, "\002B8 H\002 P H\0\0\f03".., 2048) = 2048
klseek(39, 0, 192512, 0x00000000) = 0
kread(39, "\002F0 P\002B8 P\0\0\f03".., 2048) = 2048
klseek(39, 0, 272384, 0x00000000) = 0
kread(39, "\004 ( H\002F0 H\0\0\f03".., 2048) = 2048
Here truss was unable to control that process, so I tried:
Code:
nice -5 truss -p
Code:
nice -5 truss -p 1523956
kread(0, 0x00000000, 0) Err#82 ERESTART
Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000) = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x00000164, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500) = 0
sigprocmask(0, 0x00000000, 0x3020B004) = 0
klseek(42, 0, 2048, 0x00000000) = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048) = 2048
sigprocmask(2, 0x3020B004, 0x00000000) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C1, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x300BF840) = 0
kread(46, " # # I I I S - D S E E".., 4096) (sleeping...)
kread(46, " # # I I I S - D S E E".., 4096) Err#82 ERESTART
Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000) = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C6, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500) = 0
sigprocmask(0, 0x00000000, 0x30273E64) = 0
klseek(42, 0, 2048, 0x00000000) = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048) = 2048
sigprocmask(2, 0x30273E64, 0x00000000) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C0, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x300BF840) = 0
kread(46, " # # I I I S - D S E E".., 4096) (sleeping...)
kread(46, " # # I I I S - D S E E".., 4096) Err#82 ERESTART
Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000) = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000BC, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500) = 0
sigprocmask(0, 0x00000000, 0x3020B004) = 0
klseek(42, 0, 2048, 0x00000000) = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048) = 2048
sigprocmask(2, 0x3020B004, 0x00000000) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000B7, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x300BF840) = 0
kread(46, " # # I I I S - D S E E".., 4096) (sleeping...)
kread(46, " # # I I I S - D S E E".., 4096) Err#82 ERESTART
Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000) = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C1, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500) = 0
sigprocmask(0, 0x00000000, 0x30273E64) = 0
klseek(42, 0, 2048, 0x00000000) = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048) = 2048
sigprocmask(2, 0x30273E64, 0x00000000) = 0
sigprocmask(0, 0x00000000, 0x2FF1A420) = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380) = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430) = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C5, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438) = 0
sigprocmask(0, 0x00000000, 0x300BF840) = 0
^CPstatus: process is not stopped
The job performs change capture and inserts the changed records into Oracle.
Generally this job takes around 1 minute (or 2 seconds without any records).
I also tried truss -p on the sqlldr process, but the output was:
Code:
[iehibu12] /u01/iisGIDev truss -p 3063884
kread(0, 0x0000000000000000, 0) (sleeping...)
^CPstatus: process is not stopped
Please suggest the next step.
Regards,
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
In this case the truss output isn't of any help. I wonder if the job is not progressing because it is waiting for Oracle to complete? If you change the job to not write any changes to the database does it complete? Can you have your DBA monitor the DB while this job is running?
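One way to see at a glance that a trace like the one above is just a repeating timer and poll loop is to tally the system calls in a saved copy of the output. The sketch below assumes the trace was saved to a file (for example with `truss -o`, if your truss supports that option); the function name `tally_syscalls` and the file name `job.truss` are hypothetical.

```shell
# Count how often each system call appears in a saved truss log (stdin).
# Lines that are not calls, such as "Received signal ..." or the shell
# prompt, are skipped because they do not start with name( .
tally_syscalls() {
    awk -F'[(]' '/^[A-Za-z_][A-Za-z0-9_]*\(/ { count[$1]++ }
                 END { for (name in count) print count[name], name }' |
        sort -rn
}

# Possible use (job.truss is a placeholder file name):
#   tally_syscalls < job.truss
```

If the counts are dominated by the same few calls (sigprocmask, incinterval, kread on one descriptor) and nothing else appears over time, the process is most likely idle and waiting on something else, which is consistent with looking at the database side next.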
ArndW wrote: In this case the truss output isn't of any help. I wonder if the job is not progressing because it is waiting for Oracle to complete? If you change the job to not write any changes to the database does it complete? Can you have your DBA monitor the DB while this job is running?
I tried that job again and it ran successfully.
According to my analysis, only the jobs that are designed to write to the database hang.
The DBA was not monitoring the database while the job was running, but when the job hung I consulted him.
After investigating, he told me that the database is waiting for input but is not getting it from the server.
However, I was running the job without any records, so it should have finished in 2 or 3 seconds.
That is why I am not sure at which point it got stuck.
Can you tell me what checklist to follow now to find the point of failure?
Priyadarshi Kunal