Abnormal termination of transformer stage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Abnormal termination of transformer stage

Post by saikrishna »

Hi

When I run a job that selects records from an Oracle table and inserts them into another Oracle table, I get the following warning and then the job aborts.

Abnormal termination of stage push_user_table_10pct_m..xfm_stage detected



The design of the job is as follows:

ORACLE Stage -> Transformer ->Oracle Stage

In the transformer, I am doing a direct one-to-one mapping from the source stage to the target stage.

Why did this error occur, and what could be the solution for it?

FYI: The number of records in the source table is 28,499,271.

Thanks
Sai
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

I am getting the following message in the next run:

From previous run
DataStage Job 413 Phantom 3117
Abnormal termination of DataStage.
Fault type is 11. Layer type is BASIC run machine.
Fault occurred in BASIC program JOB.1133835455.DT.1454325365.TRANS1 at address 2fc.
CRITICAL ERROR! Notify the system administrator.



Thanks
Sai
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

First suggestion would be to search for your error message "Fault type is 11" and see if that helps.
-craig

"You can never have too many knives" -- Logan Nine Fingers
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

I have searched the forum on this...

But I could not resolve my issue.

I tried running my job for 1 lakh (100,000) records, and it worked fine.

When I try to run all 28 million records, I get this problem at around 12-13 million records.

I have removed the target DB stage and replaced it with a Sequential File stage, but the problem is still not resolved. So I feel the problem is not with the target DB stage.

My job design involves a passive-active-passive connection, so there is no active-active link; that is why I didn't enable in-process or inter-process row buffering.

FYI: The same job ran successfully on our old server (version 7.5). We migrated the jobs from that server to our new server (version 7.5.2), which is on a different machine.

Any clue with this info?

Thanks
Sai
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

More info comparing the old server and the new server:
1. On our new server NLS is installed; the old server had no NLS installed.
2. I have compared the uvconfig files on the two servers and found the following differences:

< GLTABSZ 130
< GLTABSIZE 130
---
> GLTABSZ 120
214,215c213
< RLTABSZ 130
< RLTABSIZE 130
---
> RLTABSZ 120
287c285
< MAXRLOCK 129
---
> MAXRLOCK 119
554c552
< DMEMOFF 0xbdfd3000
---
> DMEMOFF 0x88b8000
563c561
< PMEMOFF 0xbf431000
---
> PMEMOFF 0x61a8000
572c570
< CMEMOFF 0xbf434000
---
> CMEMOFF 0x6b6c000
581c579
< NMEMOFF 0xbf635000
---
> NMEMOFF 0x62a2000
618,796d615




Also, the uvconfig file on our new server contains some NLS parameters.


Thanks
Sai
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

If you add a constraint "1=2" to your job does it still abort? Does it abort at the same row number every time? If you drop a few columns from the source and don't write anything to the output, does it still fail and does it fail in the same place? Basically the goal is to make simple changes to the job until the error goes away, and then narrow down the cause.
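
In case it helps to drive that narrowing-down systematically from the command line, here is a rough sketch using the dsjob CLI. It assumes a hypothetical copy of the job parameterised with START_ROW/END_ROW so it reads only a slice of the source; the project name and parameter names are made up for illustration, and dsjob's exit-code mapping for -jobstatus should be checked against your release.

# Hedged sketch: bisect the source row range to find where the job dies.
# Assumes a hypothetical job copy that takes START_ROW/END_ROW parameters;
# "dsproject" is a placeholder project name.
import subprocess

PROJECT = "dsproject"               # hypothetical
JOB = "push_user_table_10pct_m"     # job name from the abort message

def run_range(start, end):
    """Run the job over rows [start, end); True if it completes cleanly."""
    rc = subprocess.run([
        "dsjob", "-run", "-jobstatus",
        "-param", f"START_ROW={start}",
        "-param", f"END_ROW={end}",
        PROJECT, JOB,
    ]).returncode
    # -jobstatus maps the finishing job status onto dsjob's exit code;
    # verify the exact mapping on your release before relying on this.
    return rc == 0

def bisect(start, end):
    """Halve the failing range until it is small enough to eyeball."""
    while end - start > 10_000:
        mid = (start + end) // 2
        if run_range(start, mid):
            start = mid   # first half clean: problem is in the second half
        else:
            end = mid     # failure reproduces in the first half
    return start, end

print(bisect(0, 28_499_271))   # source row count from the first post

Each halving costs a full run, though, and as the next reply notes, the job does not abort at the same row each time, which already argues against a single poison row and more towards some resource being exhausted.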
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

Hi ArndW

FYI: The job is not aborting at the same row number.

I will implement 1=2 and the other options you provided.

One thing I want to tell you: it takes around 2 hours for this job to run and then abort, so I have to wait 2 hours for each experiment.

Thanks
Sai
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

Hi

a. The job with the 1=2 constraint also aborted in the middle, at around 13 million records (out of 28 million overall).

b. The job which loads directly from OCI stage to OCI stage (without a transformer) also aborted.

Any other solutions?

Thanks
Sai
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What is the transaction size and array size in the target OCI stage? Is the Oracle database exhausting some resource, such as the size of the rollback segment? (Ask your Oracle DBA to check while this job is running.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
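
To make the two settings concrete for other readers: in the OCI stage, array size is roughly "rows per network round trip" and transaction size is "rows per COMMIT", with 0 meaning a single commit at the end of the job. A minimal sketch of the interaction, in Python with the python-oracledb client purely for illustration (the stage does this internally; the SQL and connection are placeholders):

# Hedged sketch of array size vs. transaction size in a bulk insert.
import oracledb  # pip install oracledb; illustrative only

ARRAY_SIZE = 30000       # rows per round trip, as in the job above
TRANSACTION_SIZE = 0     # rows per commit; 0 = one commit at end of job

def load(rows, conn, insert_sql):
    cur = conn.cursor()
    batch = []
    uncommitted = 0
    for row in rows:
        batch.append(row)
        if len(batch) == ARRAY_SIZE:
            cur.executemany(insert_sql, batch)   # one round trip
            uncommitted += len(batch)
            batch.clear()
            if TRANSACTION_SIZE and uncommitted >= TRANSACTION_SIZE:
                conn.commit()                    # one transaction boundary
                uncommitted = 0
    if batch:
        cur.executemany(insert_sql, batch)
    conn.commit()  # with TRANSACTION_SIZE 0, this is the only commit

Note that transaction size 0 keeps the entire load in one open transaction, so undo can still accumulate on the server even with NOLOGGING (which reduces redo for direct-path operations, not undo) — which may be what the rollback-segment question is getting at.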
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

Hi

Transaction size is 0 and array size is 30000.
We have put NOLOGGING on the target table that we need to load, so there should be no rollback segment generated.

We have even tried replacing the target OCI stage with a Sequential File stage and running the job, but we still get a similar problem. From this experiment, I concluded that the problem is not in the database.

Let me know if you have any other solutions.

Thanks
Sai
saikir
Participant
Posts: 92
Joined: Wed Nov 08, 2006 12:25 am
Location: Minneapolis
Contact:

Post by saikir »

Hi,

We experienced a very similar problem in our project some time ago. I remember changing the array size to zero, and the job then worked fine. Try changing the array size to zero?

Sai
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

You've now narrowed down the problem to your source stage and transformer stage. You can now try a job which doesn't use an explicit transformer stage and just goes from source -> sequential file. Does the error persist? If you remove all the columns apart from the key and some small data column, does the error persist?
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

The main hindrance to experimentation now is that each run takes around 2 hours to produce a result. So your advice is taken, but I will have to see whether that time is available, as we are in production and I have to load without delay.

Thanks
Sai
saikrishna
Participant
Posts: 158
Joined: Tue Mar 15, 2005 3:16 am

Post by saikrishna »

The job which loads from the Oracle stage directly to a Sequential File stage also aborted, at around 13 million records.


Thanks
Sai
srinagesh
Participant
Posts: 125
Joined: Mon Jul 25, 2005 7:03 am

Post by srinagesh »

Are there any restrictions on your UNIX server such that a process cannot remain active for more than 1 hour?

I may be wrong in my thinking, but I believe the error you are getting is because the UNIX process is getting killed.

Can you do ps -ef | grep phantom, search for the job that is getting aborted, and see if its processes are still there?
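
A small monitoring sketch along those lines (Python purely for illustration; it just wraps the ps check in a once-a-minute loop so you can see when, and whether, the phantom processes disappear):

# Hedged sketch: poll for DataStage phantom processes so you can see
# whether something kills them ~2 hours into the run. Start it after
# the job has begun.
import subprocess
import time

def phantom_lines():
    # Equivalent of: ps -ef | grep phantom | grep -v grep
    out = subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout
    return [l for l in out.splitlines() if "phantom" in l]

while True:
    lines = phantom_lines()
    print(time.strftime("%H:%M:%S"), "-", len(lines), "phantom process(es)")
    for l in lines:
        print("   ", l)
    if not lines:
        break          # phantoms gone: job finished or was killed
    time.sleep(60)

It may also be worth checking ulimit -a for the user that runs DataStage, since a CPU-time or file-size limit would kill long runs partway through.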