Memory fault handling

rverharen
Premium Member
Posts: 34
Joined: Fri Jan 19, 2007 9:09 am
Location: Breda, The Netherlands

Memory fault handling

Post by rverharen »

Each night a project runs, and one job (which extracts from Oracle based on a join on large tables and moves the data to another Oracle table) leads to the following error:

Contents of phantom output file =>
RT_SC177/OshExecuter.sh[20]: 2924746 Memory fault(coredump)
Parallel job reports failure (code 139)
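
For reference, on Unix an exit code of 139 usually decodes as 128 + the signal number, i.e. signal 11 (SIGSEGV, the "Memory fault (coredump)" above). A minimal, non-DataStage-specific sketch of that arithmetic:

Code: Select all

# General Unix convention, not specific to DataStage:
# exit code 139 = 128 + 11, and signal 11 is SIGSEGV (segmentation fault).
import signal

exit_code = 139
sig = exit_code - 128            # 11
print(signal.Signals(sig).name)  # prints "SIGSEGV"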

Three other jobs run at the same time.
When we restart the job after it aborts, it finishes successfully.

Is this really due to hitting the maximum amount of memory that can be used? We doubt that, as we have other projects in which far more processes run at the same time.
On the other hand, if it were caused by the job itself, it is strange that a restart would solve it (I checked other posts, and the query does not select the same column twice).

Any suggestions on how to proceed from this point?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Get your support provider to analyze the crash dump?

Defer the start of this job until after the three other jobs complete?
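
If deferring helps, one option besides a Job Sequence is a small wrapper around the dsjob client that runs the other jobs first and only then starts the problem job. This is just a sketch: the project and job names are placeholders, and the exact dsjob options available depend on your DataStage release.

Code: Select all

# Hypothetical wrapper: serialize the failing job behind the other three
# instead of letting the scheduler start all four at once.
# Project/job names are placeholders; verify dsjob options on your release.
import subprocess

PROJECT = "MY_PROJECT"                  # placeholder project name
OTHER_JOBS = ["JobA", "JobB", "JobC"]   # the three jobs that currently run in parallel
FAILING_JOB = "ExtractJoinLoadJob"      # the job that hits the memory fault

def run_job(project: str, job: str) -> int:
    """Run a job and block until it finishes; return the dsjob exit status.
    Assumes dsjob supports '-run -jobstatus' (wait and report job status)."""
    result = subprocess.run(
        ["dsjob", "-run", "-jobstatus", project, job],
        capture_output=True, text=True,
    )
    print(result.stdout)
    return result.returncode

# Run the other jobs first (sequentially here, for simplicity),
# then start the problem job only after they have all finished.
for job in OTHER_JOBS:
    if run_job(PROJECT, job) != 0:
        raise SystemExit(f"{job} did not finish cleanly; stopping the batch")

run_job(PROJECT, FAILING_JOB)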
-craig

"You can never have too many knives" -- Logan Nine Fingers