Parallel job hanging on Oracle SQL*Loader?

wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

Morning,

We're running a very simple parallel job that writes to an Oracle table using the Load method. We have 2 nodes defined in the APT configuration file. For both nodes we receive the 'Export complete' message along with a row count. At times (seemingly at random) we receive just one 'Load completed' message, and the other node seems to be stuck without any message, error or otherwise. The job just waits in vain for the second 'Load completed' message...

SQL*Loader situation:
- the 'par' and 'ctl' files are still in the scratch folder (unexpected)
- no bad file is produced (as expected)
- the log file is complete and contains no errors (as expected)
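
The checks in the list above could be scripted so they are easy to repeat each time the job hangs. Below is a minimal Python sketch, not anything DataStage ships: the function name inspect_scratch is made up for illustration, and it assumes the loader files carry .par, .ctl, .bad and .log extensions. The scratch path would come from your own APT configuration file. ORA-nnnnn and SQL*Loader-nnn are the usual Oracle error code formats.

```python
from pathlib import Path
import re

# Error code prefixes that Oracle and SQL*Loader write into log output.
ERROR_PATTERN = re.compile(r"\b(ORA-\d{5}|SQL\*Loader-\d+)")


def inspect_scratch(scratch_dir):
    """Summarise SQL*Loader files found in one scratch directory.

    Assumption: files are identified by extension (.par, .ctl, .bad, .log).
    """
    report = {"leftover": [], "bad_files": [], "log_errors": []}
    for path in sorted(Path(scratch_dir).iterdir()):
        suffix = path.suffix.lower()
        if suffix in (".par", ".ctl"):
            # Control/parameter files that were not cleaned up after the load.
            report["leftover"].append(path.name)
        elif suffix == ".bad":
            # A bad file means SQL*Loader rejected rows.
            report["bad_files"].append(path.name)
        elif suffix == ".log":
            text = path.read_text(errors="replace")
            for match in ERROR_PATTERN.finditer(text):
                report["log_errors"].append((path.name, match.group(1)))
    return report
```

Running inspect_scratch against each node's scratch disk and comparing the "leftover" lists may show which node's loader never cleaned up.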

Nothing strange in the db2diag.log either. So...
Any other place to look for the possible reason?
Or, even better, any solutions for this unpredictable behavior? :?

thanks again,

William
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Turn operator combination off, and check the Oracle trace to see whether Oracle itself is causing the problem. Also, what is the APT_ORACLE_LOAD_OPTIONS environment variable set to?

If you are using the Oracle Enterprise stage, do you have any index rebuild or maintenance specified?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Re: Parallel job hanging on Oracle SQL*Loader?

Post by wbeitler »

Hi,

- APT_ORACLE_LOAD_OPTIONS isn't set explicitly, hence we're running with:

OPTIONS(DIRECT=TRUE, PARALLEL=TRUE, SKIP_INDEX_MAINTENANCE=YES)

- The Oracle trace doesn't show any problems.
- No indexes (or rebuilds) on the table yet, so the index maintenance options are not set.

Turned operator combination off and currently rerunning.
But since the problem occurs only once in a while, I'm not sure we'll catch it this time...

thanks for your time and effort,

William
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I have seen problems like this in the 8.0 release, and turning operator combination off solved it for us. I never really got a satisfactory answer from IBM, and since it was solved, I haven't pushed them either.
teddycarebears
Participant
Posts: 18
Joined: Wed May 12, 2010 11:57 pm

Post by teddycarebears »

What if it is a database problem and a deadlock occurs? Have you checked whether your tables are being used at that very moment by another job or process?
Able was I ere I saw Elba
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

In case of a deadlock, the trace file should have the details, and since the OP mentioned there are none, I believe that's OK.

BTW, skipping index maintenance is not the default load option; for DIRECT=TRUE, DataStage expects the index to be rebuilt or maintained unless SKIP_INDEX_MAINTENANCE is explicitly specified. IMO.
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

Checked that. No Oracle processes were running for that table when the job seemed to be hanging.
I have been constantly looping the 75+ jobs in which the error sometimes occurred, with APT_DISABLE_COMBINATION set. No luck in trapping the error again so far...
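
When looping jobs to trap an intermittent hang, it may help to put a time limit on each run so the loop stops at the first hang instead of blocking forever. The following is only a sketch of that idea in Python: run_with_timeout and loop_until_hang are hypothetical helper names, and the command passed in stands for whatever launcher you use to start the job (a harmless stand-in is used here).

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run one command; return 'finished' or 'hung'.

    Note: subprocess.run kills the local child process on timeout. That only
    affects the launcher started here, not anything running on a server.
    """
    try:
        subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "hung"
    return "finished"


def loop_until_hang(command, timeout_seconds, max_runs):
    """Repeat the command; return the 1-based run number that hung, or None."""
    for run_number in range(1, max_runs + 1):
        if run_with_timeout(command, timeout_seconds) == "hung":
            return run_number
    return None


if __name__ == "__main__":
    # Stand-in command (assumption): replace with your own job launcher.
    stand_in = [sys.executable, "-c", "pass"]
    print(loop_until_hang(stand_in, timeout_seconds=20, max_runs=3))
```

Stopping at the first hung run leaves the scratch files and logs from that run in place for inspection.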
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

What version of DataStage are you using? Have you checked whether the process for that job is still running on the server machine?
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

We're on 8.1 fixpack 1. No direct access to the ETL server unfortunately, so I couldn't check myself. But I will certainly get support to find out about server-side processes (if I get it to 'hang' again...)

William
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

wbeitler wrote:We're on 8.1 fixpack
ehhhrrr 8.1.2 that is... :o
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

Jobs are hanging again. The DataStage processes are still up but not running (no changes in CPU or memory allocation). No DB processes and no useful information in the logs after having run with APT_DISABLE_COMBINATION.
The job is just waiting there for the 'Load completed' message from the second node... Any other clues?
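
The "up but not running" observation above (no change in CPU or memory) can be made more systematic by comparing two samples of per-process counters taken some time apart. Here is a small Python sketch of just the comparison step. The Snapshot type and stalled_pids function are invented for this example, and how the samples are collected (for instance from ps output on the ETL server) is left as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One sample of a process: accumulated CPU seconds and resident memory."""
    pid: int
    cpu_seconds: float
    rss_kb: int


def stalled_pids(before, after, cpu_epsilon=0.01):
    """Return sorted PIDs seen in both samples with unchanged CPU and memory."""
    earlier = {snap.pid: snap for snap in before}
    stalled = []
    for snap in after:
        prev = earlier.get(snap.pid)
        if prev is None:
            continue  # process started after the first sample
        cpu_unchanged = abs(snap.cpu_seconds - prev.cpu_seconds) <= cpu_epsilon
        if cpu_unchanged and snap.rss_kb == prev.rss_kb:
            stalled.append(snap.pid)
    return sorted(stalled)
```

A process that shows up as stalled across several samples is a reasonable candidate to hand to support together with its process ID.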
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Have you involved your official support provider yet?
-craig

"You can never have too many knives" -- Logan Nine Fingers
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

Nope, since DSXchange has beaten them in the past more than once... :roll: But you're right. Will give them a fair chance to prove otherwise... In the meantime, still open for suggestions...
wbeitler
Premium Member
Posts: 70
Joined: Tue Feb 21, 2006 2:58 am
Location: Netherlands

Post by wbeitler »

Please tell me it can't be a lock on the APT configuration file when several jobs try to access it simultaneously ?! :roll:
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It can't be a lock on the APT configuration file when several jobs try to access it simultaneously.