Memory fault handling

rverharen
Premium Member
Posts: 34
Joined: Fri Jan 19, 2007 9:09 am
Location: Breda, The Netherlands

Memory fault handling

Post by rverharen »

Each night a project runs, and one job (which extracts from Oracle based on a join on large tables and moves the data to another Oracle table) leads to the following error:

Contents of phantom output file =>
RT_SC177/OshExecuter.sh[20]: 2924746 Memory fault(coredump)
Parallel job reports failure (code 139)
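
For reference, on Unix an exit code of 139 usually decodes as 128 + the signal number, i.e. signal 11 (SIGSEGV, the "Memory fault (coredump)" above). A minimal, non-DataStage-specific sketch of that arithmetic:

Code: Select all

# General Unix convention, not specific to DataStage:
# exit code 139 = 128 + 11, and signal 11 is SIGSEGV (segmentation fault).
import signal

exit_code = 139
sig = exit_code - 128            # 11
print(signal.Signals(sig).name)  # prints "SIGSEGV"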

Three other jobs run at the same time.
When we restart the job after it aborts, it finishes successfully.

Is this really due to hitting the maximum amount of memory that can be used? We doubt that, as we have other projects in which far more processes run at the same time.
On the other hand, if it were caused by the job itself, it is strange that a restart would solve it (I checked other posts, and the query does not select the same column twice).

Any suggestions on how to proceed from this point?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Get your support provider to analyze the crash dump?

Defer the start of this job until after the three other jobs complete?
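
If deferring helps, one option besides a Job Sequence is a small wrapper around the dsjob client that runs the other jobs first and only then starts the problem job. This is just a sketch: the project and job names are placeholders, and the exact dsjob options available depend on your DataStage release.

Code: Select all

# Hypothetical wrapper: serialize the failing job behind the other three
# instead of letting the scheduler start all four at once.
# Project/job names are placeholders; verify dsjob options on your release.
import subprocess

PROJECT = "MY_PROJECT"                  # placeholder project name
OTHER_JOBS = ["JobA", "JobB", "JobC"]   # the three jobs that currently run in parallel
FAILING_JOB = "ExtractJoinLoadJob"      # the job that hits the memory fault

def run_job(project: str, job: str) -> int:
    """Run a job and block until it finishes; return the dsjob exit status.
    Assumes dsjob supports '-run -jobstatus' (wait and report job status)."""
    result = subprocess.run(
        ["dsjob", "-run", "-jobstatus", project, job],
        capture_output=True, text=True,
    )
    print(result.stdout)
    return result.returncode

# Run the other jobs first (sequentially here, for simplicity),
# then start the problem job only after they have all finished.
for job in OTHER_JOBS:
    if run_job(PROJECT, job) != 0:
        raise SystemExit(f"{job} did not finish cleanly; stopping the batch")

run_job(PROJECT, FAILING_JOB)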
-craig

"You can never have too many knives" -- Logan Nine Fingers