Parallel job reports failure (code 132)

rangas999 · Post by **rangas999** » Wed May 04, 2011 6:05 am

Dear all

I am getting the following error while running a job that updates a progress table through an ODBC stage.

dataset-->ODBC

Contents of phantom output file =>
RT_SC9/OshExecuter.sh[20]: 360668 Illegal instruction(coredump)

Parallel job reports failure (code 132)

To investigate, I checked this file:
/DataStage/product/Ascential/DataStage/Projects/ETA/RT_SC9/OshExecuter.sh

This file contains:
======================
#!/bin/sh
# Shell script for Datastage to execute an Orchestrate osh script, generated at 2011-05-04 12:28:40
#
# Parameters:
# $1 - osh script to run (file path)
# $2 - fifo name to write osh diagnostic output
# $3 onwards - osh options (may be absent)
oshscript=$1
oshpipe=$2
shift
shift
if test ! -x "$APT_ORCHHOME/bin/osh"
then echo '##OSHRETVAL NOOSH' > $oshpipe
exit 1
fi
$APT_ORCHHOME/bin/osh "$@" -f $oshscript > $oshpipe 2>&1 &
oshpid=$!
# Write the pid of the conductor process to the fifo
echo '##OSHPID' $oshpid > $oshpipe
wait $oshpid
# Write the terminating string to the fifo
echo '##OSHRETVAL' $? > $oshpipe
# end of script
=======

Can you please suggest what I need to change in this file?

Thanks in advance

greggknight · Post by **greggknight** » Wed May 04, 2011 7:20 am

First off, I would not touch this file unless you are 100% sure of how the PX engine works, all the processes it uses when running a job, and so on.

This is the file that creates the pipes and osh processes for your job.

I would look at the core dump and the job.
Something in the job is probably incorrect.
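
As a starting point for reading the log, the status 132 can be decoded by hand. This is a minimal sketch, assuming the usual shell convention that a status above 128 means "128 plus the signal number"; the function name `decode_status` is made up for illustration. Under that convention 132 is signal 4, which should come out as ILL (SIGILL) and would match the "Illegal instruction(coredump)" line in the phantom output.

```shell
#!/bin/sh
# Decode a shell exit status above 128 into the signal that ended the process.
# Assumed convention: status = 128 + signal number (so 132 -> signal 4).
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        signum=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix
        kill -l "$signum"
    else
        echo "exit status $status is not a signal-style status"
    fi
}

decode_status 132
```

If that reading is right, the code points at a crashing process rather than at a particular DataStage condition, which is why the core dump is worth a look.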

rangas999 · Post by **rangas999** » Wed May 04, 2011 11:02 am

Hi

Thanks for your reply.

I am getting the following error while running the job in system test. The job updates a progress table through an ODBC stage.

info: Contents of phantom output file =>
RT_SC9/OshExecuter.sh[20]: 360668 Illegal instruction(coredump)

FATAL ERROR: Parallel job reports failure (code 132)

I searched for this error, and the suggestion was to look at that script file. That is why I posted the file.

Job design:

dataset-->ODBC

SQL in the ODBC stage:

UPDATE CLIENT
SET
SPARECHAR3 ='ETR'
WHERE
ACNO=ORCHESTRATE.acno

This works fine in development.

Please suggest a solution for this fatal error.

Thanks

greggknight · Post by **greggknight** » Wed May 04, 2011 11:07 am

What is the rest of the job log,
and did you look at the core dump?
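
In case it helps with locating the core dump, here is a rough sketch. It assumes the core file was written somewhere under the project directory and has a name starting with `core`; both are guesses, and the helper name `find_cores` is made up.

```shell
#!/bin/sh
# List candidate core files under a directory.
# Assumption: the core file name starts with "core" and lives under the
# directory you pass in; adjust both to match your system.
find_cores() {
    dir=$1
    find "$dir" -type f -name 'core*' -print
}

# Example call (project path taken from the first post):
#   find_cores /DataStage/product/Ascential/DataStage/Projects/ETA
# Running 'file' on a result usually reports which program dumped core:
#   file <path-to-core-file>
```

Knowing which executable produced the core (osh itself, or a library loaded by the ODBC stage) narrows down where to look next.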

rangas999 · Post by **rangas999** » Thu May 05, 2011 2:08 am

FATAL:Parallel job reports failure (code 132)
These are the remaining job log entries:

INFO:dataset-->ODBC
info:main_program: The open files limit is 2000; raising to 2147483647.
info:main_program: orchgeneral: loaded
orchsort: loaded
orchstats: loaded
info:main_program: APT configuration file: /DataStage/product/Ascential/DataStage/Configurations/ETA.apt
{
node "node0" {
fastname "gb02qas56tefxx7" /* node name on a fast network */
pools "" "node0"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets1" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch1" {}
}
node "node1" {
fastname "gb02qas56tefxx7"
pools "" "node1"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets2" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch2" {}
}
node "node2" {
fastname "gb02qas56tefxx7"
pools "" "node2"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets3" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch3" {}
}
node "node3" {
fastname "gb02qas56tefxx7"
pools "" "node3"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets4" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch4" {}
}
}
info:Contents of phantom output file =>
RT_SC9/OshExecuter.sh[20]: 360668 Illegal instruction(coredump)

FATAL:Parallel job reports failure (code 132)

synsog · Post by **synsog** » Thu May 05, 2011 6:23 am

See this thread:

viewtopic.php?t=99257

rangas999 · Post by **rangas999** » Sat May 07, 2011 8:44 am

Hi, thanks for the reply.

The dataset I am reading exists in the same path that I have given in JOB2.

I am able to see that dataset. One more thing: the job runs fine in the development environment. I only get this error after exporting it into system test.

My sequence contains two jobs:

job1

oracle ------->dataset1

job2

dataset1..............>odbc

In Job2's dataset stage I am reading the data from JOB1's output dataset.
After the first job runs, the dataset exists in the path that I have given in the second job's dataset stage.

Even though the dataset exists after the first job runs, I am getting the error below:

"Parallel job reports failure (code 132)".

Please suggest the necessary steps.

greggknight · Post by **greggknight** » Sat May 07, 2011 9:07 am

I think the problem is with your config.apt file.

If I am looking at this right, you have four nodes writing to one controller.
Your scratch disk and resource disk are the same.

Or are these different mounts on different controllers?

The first thing I would do is move my scratch disk to a different filesystem on a different controller.

The way I see this, not knowing your hardware config, all four nodes are reading and writing to the same filesystem at the same time.
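
One way to answer the mount question is to ask `df` which filesystem holds each path. The sketch below is only an illustration: the helper name `mount_of` is made up, and it assumes mount points without spaces in their names. The paths are the node0 entries from the config posted above.

```shell
#!/bin/sh
# Print the mount point that holds a given path, using POSIX 'df -P'.
# Assumption: the mount point contains no spaces (last field of the data line).
mount_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Compare the resource disk and scratch disk of node0.
for p in /DataStage/temp_files/PRD/ETA/Datasets1 /DataStage/temp_files/PRD/Scratch1; do
    if [ -d "$p" ]; then
        echo "$p -> $(mount_of "$p")"
    fi
done
```

If every path reports the same mount point, all four nodes share one filesystem. That says nothing about the controllers underneath, which the system administrator would have to confirm.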

rangas999 · Post by **rangas999** » Mon May 09, 2011 1:28 am

Hi, thanks for your reply.

In system test we have 4 nodes, but in development we have a single node.

I am getting this error in system test only.

This is my config file in system test, named ETA.apt, in this path:

/DataStage/product/Ascential/DataStage/Configurations/ETA.apt

ETA.apt file:

gb02qas56tefxx7[/DataStage/product/Ascential/DataStage/Configurations]$ more ETA.apt
{
node "node0" {
fastname "gb02qas56tefxx7" /* node name on a fast network */
pools "" "node0"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets1" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch1" {}
}
node "node1" {
fastname "gb02qas56tefxx7"
pools "" "node1"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets2" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch2" {}
}
node "node2" {
fastname "gb02qas56tefxx7"
pools "" "node2"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets3" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch3" {}
}
node "node3" {
fastname "gb02qas56tefxx7"
pools "" "node3"
resource disk "/DataStage/temp_files/PRD/ETA/Datasets4" {}
resource scratchdisk "/DataStage/temp_files/PRD/Scratch4" {}
}
}
ETA.apt: END

Can you please suggest what I need to change in this config file, and what the final solution is for this error: Parallel job reports failure (code 132)?

Thanks in advance.

rangas999 · Post by **rangas999** » Mon May 09, 2011 8:23 am

Can you please suggest what I need to change in this config file, and what the final solution is for this error: Parallel job reports failure (code 132)?

Thanks in advance.

greggknight · Post by **greggknight** » Mon May 09, 2011 8:43 am

Well, first: if it works on a single-node config and fails when you move it and run it on the 4-node config, then I would start simple.
Use a simple 1-node config on the test project.
If that works, then you probably need to look at the config.
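
A 1-node test config could look roughly like the sketch below. It simply copies the node0 entry from the ETA.apt posted earlier; the output file name `ETA_1node.apt` and the function name are made up, so treat this as an illustration rather than a recommended layout.

```shell
#!/bin/sh
# Write a one-node parallel configuration file for a test run.
# The node entry copies node0 from the posted ETA.apt; the output file
# name is hypothetical.
write_one_node_config() {
    out=$1
    cat > "$out" <<'EOF'
{
    node "node0" {
        fastname "gb02qas56tefxx7"
        pools ""
        resource disk "/DataStage/temp_files/PRD/ETA/Datasets1" {}
        resource scratchdisk "/DataStage/temp_files/PRD/Scratch1" {}
    }
}
EOF
}

write_one_node_config "${TMPDIR:-/tmp}/ETA_1node.apt"
```

The job picks up its configuration from the APT_CONFIG_FILE environment variable, so pointing that at the new file for one run (for example as a job parameter) is enough for the test. It is probably simplest to run both jobs of the sequence under the same file, so that dataset1 is written and read with the same layout.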

So I have a couple of questions.
1) Are you the DataStage admin?
If not, I would have them set up a proper config file.
2) How many CPUs do you have in this environment?
3) How many spindles do you have available?
4) How many concurrent jobs will be running, and what kind of stages are in them?
5) What you sent is not the log; it is one entry in the log.
Everything is relevant.

greggknight · Post by **greggknight** » Mon May 09, 2011 8:47 am

It's apparent that if it runs in one environment and not the other, then you need to figure out what the differences are between the two environments.
The config is just the start.
If they are different servers, are they installed and configured the same way as well?
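
A quick way to start that comparison is to capture the same few facts on each server and diff the results. The sketch below is only a starting point: the function name `snapshot_env` is made up, and the list of variables is a guess at what matters for a parallel job with an ODBC stage. It should be run in a shell that has the DataStage environment loaded.

```shell
#!/bin/sh
# Capture a few facts about this server so two environments can be diffed.
# The variable list is an assumption; add whatever else your install relies on.
snapshot_env() {
    echo "uname: $(uname -s -r -m)"
    echo "APT_ORCHHOME: ${APT_ORCHHOME:-unset}"
    echo "APT_CONFIG_FILE: ${APT_CONFIG_FILE:-unset}"
    echo "ODBCINI: ${ODBCINI:-unset}"
    echo "LIBPATH: ${LIBPATH:-unset}"
    echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH:-unset}"
}

snapshot_env
```

Redirect the output to a file on the development server and on the system test server, then compare the two files with `diff`. An illegal-instruction crash is often, though not always, a sign of binaries or libraries that do not match the machine, so differences in the library paths or the OS level are worth checking first.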

rangas999 · Post by **rangas999** » Wed May 11, 2011 5:39 am

Hi, thanks for your reply.

I guess this is not a configuration file issue, because the first job in the sequence runs fine. The second job needs to read data from the first job's output (the dataset), and that is where this error shows up.

And I believe code 132 means the dataset does not exist.

Please advise on this.

Please see this link and what Ray suggested. I explained my job design and sequence in my third reply.

viewtopic.php?t=99467&highlight=Parallel+job+reports+failure+%28code+132%29

Thanks

synsog · Post by **synsog** » Wed May 11, 2011 5:47 am

Are you able to see the data using 'View Data' on dataset1 in both job1 and job2?

Also, in job2, try replacing the Oracle stage with a new dataset/file (just a thought).

rangas999 · Post by **rangas999** » Thu May 12, 2011 5:53 am

Hi, thanks for the reply.

Actually, I am getting this error in system test, not in dev. In dev it works fine.

So how do I change the dataset over there in system test? Do you want me to change it in dev and export it again to system test?

Please suggest what I need to do to get rid of this error:

Parallel job reports failure (code 132)

Thanks