
Job is getting stuck

Posted: Fri Apr 25, 2008 3:35 am
by Sravani
Hi Gurus,
When I run a job from the Unix prompt, it starts and the OSH script is initiated, but it does not move any further from there. It gets stuck.
How can we track the status and find out why it is getting stuck like that?

Thanks.

Posted: Fri Apr 25, 2008 3:44 am
by ArndW
Identify the process ids and then use the "truss -p {pid}" command to see what these processes might be doing.
Does this mean if you start the job from the director it does not get "stuck"? What does the job do?
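
A minimal sketch of the first step, in case it helps. It assumes that "ps -ef" prints the PID in the second column, as it does on AIX and most Unix systems. The function name "pids_matching" and the job name "MyJob" are placeholders for illustration, not anything from DataStage itself.

```shell
#!/bin/sh
# Read "ps -ef" style text on standard input and print the PID (second
# column) of every line that contains the given text.
# pids_matching is a hypothetical helper name.
pids_matching() {
    # $2 must be numeric, which also skips the "UID PID PPID ..." header line.
    awk -v pat="$1" '$2 ~ /^[0-9]+$/ && index($0, pat) > 0 { print $2 }'
}

# Usage (MyJob is a placeholder job name). Capturing the ps output first
# keeps the awk process itself out of the listing:
#   ps_out=$(ps -ef)
#   printf '%s\n' "$ps_out" | pids_matching MyJob
#   truss -p <pid printed above>
```

Each PID it prints can then be passed to "truss -p" one at a time.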

Posted: Sun Apr 27, 2008 9:14 am
by priyadarshikunal
ArndW wrote:Identify the process ids and then use the "truss -p {pid}" command to see what these processes might be doing.
Does this mean if you start the job from the director it does not get "stuck"? What does the job do?

Hi Arnd,

Sorry for posting my query in this thread, but I think it is the same issue.

I am getting the same problem and I am not able to understand what is going on inside.

While analyzing the truss output I can see that the sequence is running fine.

The job is also running, but I am unable to analyze it completely.

Output of truss -p for the sequence:

Code:

[iehibu12] /u01/iisGIDev truss -p 1482774
_nsleep(0x00000000, 0x00000000) (sleeping...)
_nsleep(0x00000000, 0x00000000)                 = 0
sigprocmask(0, 0x00000000, 0x302DBE34)          = 0
klseek(39, 0, 6144, 0x00000000)                 = 0
kread(39, "\0\018 $\0\0\0 $\0\0\b C".., 2048)   = 2048
klseek(39, 0, 83968, 0x00000000)                = 0
kread(39, "\001 H T\0\018 T\0\0\b03".., 2048)   = 2048
klseek(39, 0, 110592, 0x00000000)               = 0
kread(39, "\001B0 L\001 H L\0\0\b03".., 2048)   = 2048
klseek(39, 0, 114688, 0x00000000)               = 0
kread(39, "\001C0 L\001B0 L\0\0\f03".., 2048)   = 2048
klseek(39, 0, 131072, 0x00000000)               = 0
kread(39, "\002\0 @\001C0 @\0\0\f03".., 2048)   = 2048
klseek(39, 0, 145408, 0x00000000)               = 0
kread(39, "\002 8 @\002\0 @\0\0\f03".., 2048)   = 2048
klseek(39, 0, 151552, 0x00000000)               = 0
kread(39, "\002 P H\002 8 H\0\0\f03".., 2048)   = 2048
klseek(39, 0, 178176, 0x00000000)               = 0
kread(39, "\002B8 H\002 P H\0\0\f03".., 2048)   = 2048
klseek(39, 0, 192512, 0x00000000)               = 0
kread(39, "\002F0 P\002B8 P\0\0\f03".., 2048)   = 2048
klseek(39, 0, 272384, 0x00000000)               = 0
kread(39, "\004 ( H\002F0 H\0\0\f03".., 2048)   = 2048
Then I tried to run truss -p on the job's process, but truss was unable to control that process.

So I tried:

Code:

nice -5 truss -p 
The result is as follows:

Code:

nice -5 truss -p 1523956
kread(0, 0x00000000, 0)                         Err#82 ERESTART
    Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000)          = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x00000164, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500)                      = 0
sigprocmask(0, 0x00000000, 0x3020B004)          = 0
klseek(42, 0, 2048, 0x00000000)                 = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048)   = 2048
sigprocmask(2, 0x3020B004, 0x00000000)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C1, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x300BF840)          = 0
kread(46, " # # I   I I S - D S E E".., 4096) (sleeping...)
kread(46, " # # I   I I S - D S E E".., 4096)   Err#82 ERESTART
    Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000)          = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C6, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500)                      = 0
sigprocmask(0, 0x00000000, 0x30273E64)          = 0
klseek(42, 0, 2048, 0x00000000)                 = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048)   = 2048
sigprocmask(2, 0x30273E64, 0x00000000)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C0, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x300BF840)          = 0
kread(46, " # # I   I I S - D S E E".., 4096) (sleeping...)
kread(46, " # # I   I I S - D S E E".., 4096)   Err#82 ERESTART
    Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000)          = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000BC, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500)                      = 0
sigprocmask(0, 0x00000000, 0x3020B004)          = 0
klseek(42, 0, 2048, 0x00000000)                 = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048)   = 2048
sigprocmask(2, 0x3020B004, 0x00000000)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000B7, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x300BF840)          = 0
kread(46, " # # I   I I S - D S E E".., 4096) (sleeping...)
kread(46, " # # I   I I S - D S E E".., 4096)   Err#82 ERESTART
    Received signal #14, SIGALRM [caught]
sigprocmask(2, 0x300BF840, 0x00000000)          = 0
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C1, 0x00000000, 0x00000000) = 0x00000000
appgettimer(9, 0x2FF1A500)                      = 0
sigprocmask(0, 0x00000000, 0x30273E64)          = 0
klseek(42, 0, 2048, 0x00000000)                 = 0
kread(42, "\0\0\t10\0\00110\0\0\f03".., 2048)   = 2048
sigprocmask(2, 0x30273E64, 0x00000000)          = 0
sigprocmask(0, 0x00000000, 0x2FF1A420)          = 0
sigprocmask(2, 0xF0464790, 0x2FF1A380)          = 0
_sigaction(14, 0x2FF1A440, 0x2FF1A430)          = 0
thread_setmymask_fast(0x00000000, 0x00000000, 0x00000000, 0x103EE005, 0x00000000, 0x000000C5, 0x00000000, 0x00000000) = 0x00000000
incinterval(0, 0x2FF1A428, 0x2FF1A438)          = 0
sigprocmask(0, 0x00000000, 0x300BF840)          = 0
^CPstatus: process is not stopped
What does this job do?

It performs change capture and inserts the changed records into Oracle.

Generally this job takes around 1 minute (or 2 seconds without any records).

I also tried truss -p on the sqlldr process, but the output was:

Code:

[iehibu12] /u01/iisGIDev truss -p 3063884
kread(0, 0x0000000000000000, 0) (sleeping...)
^CPstatus: process is not stopped
I think sqlldr is sleeping indefinitely.
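
To condense long truss listings like the ones above, here is a rough sketch that counts which calls truss reported as "(sleeping...)" and on which first argument (the file descriptor, in the case of kread). The function name "sleeping_calls" and the file name in the usage comment are hypothetical, and the parsing assumes the line layout shown in this post.

```shell
#!/bin/sh
# Read truss output on standard input and print one line per blocked call:
#   <syscall> <first argument> <number of "(sleeping...)" reports>
# sleeping_calls is a hypothetical helper name.
sleeping_calls() {
    awk '
        /\(sleeping\.\.\.\)[ \t]*$/ {
            name = $0
            sub(/\(.*/, "", name)      # keep the text before the first "("
            arg = $0
            sub(/^[^(]*\(/, "", arg)   # drop "name("
            sub(/[,)].*/, "", arg)     # keep only the first argument
            count[name " " arg]++
        }
        END {
            for (key in count) print key, count[key]
        }
    ' | sort
}

# Usage, with the truss output saved to a file first
# (job_truss.out is a placeholder name):
#   sleeping_calls < job_truss.out
```

For the sqlldr trace above this would report a single kread on descriptor 0, which by Unix convention is standard input. Something like AIX's procfiles command, if it is available on the server, may help to map the other descriptor numbers to actual files or pipes.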

Please suggest the next step.

Regards,

Posted: Sun Apr 27, 2008 10:00 am
by ArndW
In this case the truss output isn't of any help. I wonder if the job is not progressing because it is waiting for Oracle to complete? If you change the job to not write any changes to the database does it complete? Can you have your DBA monitor the DB while this job is running?

Posted: Sun Apr 27, 2008 10:38 am
by priyadarshikunal
ArndW wrote:In this case the truss output isn't of any help. I wonder if the job is not progressing because it is waiting for Oracle to complete? If you change the job to not write any changes to the database does it complete? Can you have your DBA monitor the DB while this job is running?
I tried that job again and it ran successfully.

According to my analysis, only the jobs that are designed to write to the database are hanging.

The DBA was not monitoring the database while that job was running, but when the job hung I consulted him.
After investigating, he said that the database is waiting for input but is not getting it from the server.

But I was trying to run that job without any records, so it should have finished in 2 or 3 seconds.

That is why I am not sure at which point it got stuck.

Can you tell me what checklist to follow now to find the point of failure?