Why does the job report as successful with Fatal messages?

nkreddy
Premium Member
Posts: 23
Joined: Mon Jun 21, 2004 7:12 am
Location: New York

Post by nkreddy »

Database: Oracle 9i

Write Method Used: Load and Append

Index Mode: Rebuild

The problem:

The Parallel job reports successful completion even though I get messages like the ones below. I do understand that we have to make ControlM and the shell script that runs the DS jobs robust enough to abend when we get Fatal errors like this.

The first time I got this Fatal Error, I asked the DBA to increase the TEMP space.

TEST_TABLE: Oracle call failed; sqlcode = -12801; message: ORA-12801: error signaled in parallel query server P000
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

TEST_TABLE: Index `SYS_C001634' on table `TEST_TABLE_B' has NOT been rebuilt.

Now I get this, which I am not sure of...

TEST_TABLE,0: GenericQuery:esqlErrorHandler
Prepare failed for: GenericStmt_9
query is: SELECT banner FROM V$VERSION
sqlcode is: -3113
esql complaint: ORA-03113: end-of-file on communication channel
.
TEST_TABLE,0: Cannot find the version in v$version.

TEST_TABLE,0: GenericQuery:esqlErrorHandler
Prepare failed for: GenericStmt_10
query is: SELECT banner FROM V$VERSION
sqlcode is: -3114
esql complaint: ORA-03114: not connected to ORACLE

I am trying to understand why the Parallel job reports as successful when there are Fatal messages in the log. I believe the job should abort. It does abort in many cases where there are Fatal messages. Is there a reason for this?
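Until the cause is clear, the wrapper-side check mentioned above could be sketched roughly as follows. This is only a minimal illustration, not a DataStage feature: the patterns are assumptions taken from the log excerpts in this post, the function names are hypothetical, and how the job log text gets to the script is left open.

```python
import re

# Assumed patterns, based only on the log excerpts quoted in this thread.
FATAL_PATTERNS = [
    re.compile(r"ORA-\d{5}"),
    re.compile(r"Oracle call failed"),
    re.compile(r"Prepare failed for"),
    re.compile(r"has NOT been rebuilt"),
]


def find_fatal_lines(log_text):
    """Return the log lines that match any of the assumed fatal patterns."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in FATAL_PATTERNS)
    ]


def exit_code_for(log_text):
    """Return 1 if any fatal-looking line is present, otherwise 0."""
    return 1 if find_fatal_lines(log_text) else 0
```

The calling script would pass the captured job log to `exit_code_for` and exit with that value, so that the scheduler sees a failure even when the job itself reports success.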

I did search the forum and couldn't find a good reason behind this.

Please advise.

Thank You.
seanc217
Premium Member
Posts: 188
Joined: Thu Sep 15, 2005 9:22 am

Post by seanc217 »

It sounds like your Oracle connection is timing out. We had the same issue on long-running jobs and finally turned timeouts off. You will have to work with your DBA on this. In short, we added the following entry to the sqlnet.ora file on both our client (the DS server) and our server (the Oracle server):

sqlnet.expire_time=0
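To confirm that the entry is actually present on both machines, something like the following could be used. It is a rough sketch that assumes simple one-line `name=value` entries and `#` comments; the function name is hypothetical, and the file path is left to the caller because it depends on the installation.

```python
def read_expire_time(sqlnet_text):
    """Return SQLNET.EXPIRE_TIME as an int, or None if it is not set."""
    for raw_line in sqlnet_text.splitlines():
        # Drop anything after a '#' comment marker, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Parameter names are compared case-insensitively.
        if key.strip().lower() == "sqlnet.expire_time":
            return int(value.strip())
    return None
```

Reading each machine's sqlnet.ora into a string and calling `read_expire_time` on it would show whether the two sides agree.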


Hope this helps.
nkreddy

Post by nkreddy »

Thank you, Sean.

I will incorporate this and run it again. But my question still remains: why does the Parallel job report success at the end? Interestingly, this is not consistent with other Fatal messages, which do abort the job. Do we have to get a patch from Ascential?
seanc217

Post by seanc217 »

I am not 100% sure why it does this. I have seen it with the infamous 139 error message as well. The only thing I can think of is that DataStage sees that the job finished successfully and then hits the timeout; since it is an Oracle error rather than a DataStage error, it logs that something happened but still finishes with a success code. That's the best I can come up with.

Sean