Parallel job hanging on Oracle SQL*Loader?

siauchun84
Participant
Posts: 63
Joined: Mon Oct 20, 2008 12:01 am
Location: Malaysia

Post by siauchun84 »

When your job hangs, have you checked Task Manager on the server for the total number of osh.exe processes? I have faced a similar problem before, in which lots of osh.exe processes were still present in the task list.
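
For anyone who would rather script that check than open Task Manager, here is a minimal sketch. It assumes Python is available on the Windows server, which is not something this thread confirms. It uses the standard Windows `tasklist` command; the function names `parse_tasklist_csv` and `list_osh_pids` are made up for illustration. It only lists processes and does not terminate anything.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(output, image_name="osh.exe"):
    """Return the PIDs found in `tasklist /FO CSV /NH` output for image_name."""
    pids = []
    for row in csv.reader(io.StringIO(output)):
        # Data rows look like: "osh.exe","4120","Services","0","12,345 K"
        # The "INFO: No tasks are running..." message has no PID field and is skipped.
        if len(row) >= 2 and row[0].lower() == image_name.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def list_osh_pids():
    """Run tasklist on the local Windows machine and return the osh.exe PIDs."""
    result = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq osh.exe", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tasklist_csv(result.stdout)
```

Calling `list_osh_pids()` before and after a job run shows whether osh.exe processes are left behind. The parsing is kept separate from the `tasklist` call so it can be checked against saved output.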
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

Thanks for that. Weird problems make you look for weird things...
I'll keep you posted.
wbeitler

Post by wbeitler »

siauchun84,
siauchun84 wrote: When your job hang, have you checked on the task manager in the server on the total numbers of the Osh.exe?
Aborting the job manually actually did get rid of 12 osh.exe processes.
Care to share how you solved your problem?
siauchun84

Post by siauchun84 »

wbeitler wrote: siauchun84,
Aborting the job manually actually did get rid of 12 osh.exe processes.
Care to share how you solved your problem?
Hi wbeitler, what I did was create a VBScript file to kill all the osh.exe processes before I trigger my next job, since I was using the Windows platform.
*Note: Killing osh.exe will terminate all running jobs.

You may copy the following into Notepad and save it as a .vbs file on the server (just double-click it if you have direct access to the server; if not, create a sequence job to trigger the script):
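-------- VB Script --------
' WARNING: terminates every osh.exe on this machine, including those of jobs still running.
Option Explicit
Dim wmi, wql, oResults, oProcess

' Connect to the local WMI service and select all osh.exe processes.
Set wmi = GetObject("winmgmts:")
wql = "SELECT * FROM Win32_Process WHERE Name = 'osh.exe'"
Set oResults = wmi.ExecQuery(wql)

' Terminate each matching process.
For Each oProcess In oResults
    oProcess.Terminate
Next

Set oResults = Nothing
Set wmi = Nothing
---- End of VB Script ----------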
wbeitler

Post by wbeitler »

Siauchun,

Thanks for sharing, but that is not an option here since we're running multiple jobs in parallel.

The jobs actually didn't crash after we enabled the APT_RECORD_COUNTS reporting environment variable. That somehow seems to 'keep the communication alive'?! Would that make any sense?!
wbeitler

Post by wbeitler »

Doesn't make any sense... The job is hanging again. We now do get the second 'Load complete' message, but the 'Record count' for only one of the nodes. Waiting in vain for the second node to return its record count :cry:
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So... did you ever bring support into the picture? Have they tried to figure out what in the heck might be going on? Seems like you are well and truly into their territory now.
-craig

"You can never have too many knives" -- Logan Nine Fingers
wbeitler

Post by wbeitler »

PMR logged. Still waiting for their answer.
wbeitler

Post by wbeitler »

Installed patch JR36567 and Fix Pack 3 as advised by IBM Support.
Didn't fix the problem though... Any new thoughts?

William