SIGSEGV error

sohasaid
Premium Member
Posts: 115
Joined: Tue May 20, 2008 3:02 am
Location: Cairo, Egypt

SIGSEGV error

Post by sohasaid »

Hi all,

I get the following error when the job updates around 20,000 records at the destination:

Operator terminated abnormally: received signal SIGSEGV.

If anyone knows the root causes of the SIGSEGV error, kindly share them. It appears frequently and then disappears when I reset the jobs, and I don't know why.

Data Source: Oracle RDB
Destination: Oracle 10g
Job Design:
ODBC Enterprise Stage --> Join --> Transformer --> Oracle Enterprise Stage

Regards.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

There can be multiple problems behind this error. Let us know what you have tried so far. There are a lot of posts related to this error; do a search to find more information.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
sohasaid
Premium Member
Posts: 115
Joined: Tue May 20, 2008 3:02 am
Location: Cairo, Egypt

Post by sohasaid »

Thanks for your reply.
I'm using the same job pattern for many jobs. Only a few jobs produce this error, so I don't know why the SIGSEGV happens.

Regards.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Neither do we, as many issues can cause this 'segmentation violation'. In my experience, with Oracle it can be a bug in their client code which only a very small number of jobs trigger. Check with your official support provider, as I don't believe Oracle RDB is officially supported and there may be some 'known issues' with accessing it.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sohasaid
Premium Member
Posts: 115
Joined: Tue May 20, 2008 3:02 am
Location: Cairo, Egypt

Post by sohasaid »

Maybe a new clue here. The job aborts even though it transfers all data from source to destination correctly. In Director, it reports SIGSEGV or "*** glibc detected *** free(): invalid next size (normal): 0x08395fe8 ***" errors.
On the second run, after a reset, the job runs normally without warnings or errors! I tried maximizing the Array size of the source and destination stages, but it didn't help.

Any ideas?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

If the free() call fails then you might have corrupted your memory pointers, perhaps by having run out of memory. If you monitor physical and virtual memory usage during the job run, do you see a large amount of memory used or constant growth until out of memory?
sohasaid
Premium Member
Posts: 115
Joined: Tue May 20, 2008 3:02 am
Location: Cairo, Egypt

Post by sohasaid »

ArndW wrote: If you monitor physical and virtual memory usage during the job run, do you see a large amount of memory used or constant growth until out of memory?
How can I do that?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

There are several tools available: all AIX systems have 'vmstat', and optional tools such as 'topas' might be installed as well. It might be best to ask your system administrator for some help, since the output of the tools can be a bit cryptic upon first use.
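
If you would rather watch a single process than the whole machine, here is a minimal Python sketch of the "constant growth" check described above. It is only an illustration under assumptions: the function names (parse_ps_line, sample_memory, grows_steadily, watch) are made up for this example, it assumes your 'ps' accepts '-o rss= -o vsz=' (the field names can differ between platforms), and you have to find the process ID of the job's operator process yourself.

```python
import subprocess
import time


def parse_ps_line(line):
    """Parse one 'rss vsz' line from ps into a tuple of ints (KB on Linux)."""
    rss, vsz = line.split()
    return int(rss), int(vsz)


def sample_memory(pid):
    """Return (rss, vsz) for one process.

    Assumption: this ps supports '-o rss= -o vsz='; adjust the field
    names if your platform uses different ones.
    """
    out = subprocess.run(
        ["ps", "-o", "rss=", "-o", "vsz=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps_line(out)


def grows_steadily(values, min_fraction=0.9):
    """True if at least min_fraction of consecutive samples increase."""
    if len(values) < 2:
        return False
    rises = sum(1 for a, b in zip(values, values[1:]) if b > a)
    return rises / (len(values) - 1) >= min_fraction


def watch(pid, interval_s=5, count=60):
    """Sample a process periodically; report whether its virtual size keeps growing."""
    vsz_samples = []
    for _ in range(count):
        rss, vsz = sample_memory(pid)
        print(f"rss={rss} vsz={vsz}")
        vsz_samples.append(vsz)
        time.sleep(interval_s)
    return grows_steadily(vsz_samples)
```

Calling watch(pid) while the job runs prints one sample per interval and returns True if the virtual size rose in nearly every interval, which would point towards the "constant growth until out of memory" case rather than a one-off spike. The 0.9 threshold is an arbitrary choice for the sketch.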
sukrishnan
Participant
Posts: 2
Joined: Tue Mar 17, 2009 1:12 am
Location: Perth

*** glibc detected *** free()

Post by sukrishnan »

My job is as follows:
Seq File -> Transformer -> Oracle Connector -> Reject link to Transformer -> DRS stage

In the Oracle Connector:
If I set "Process warning messages as fatal error" as "NO" I get:
*** glibc detected *** free(): invalid next size (fast): 0x083ac588 ***

If this is set to "YES", I get:
*** glibc detected *** free(): invalid pointer: 0x083ac5a8 ***


These errors occur every time a valid reject occurs, where we expect the rejected record to be written to a reject table, i.e. a primary key constraint in the target table has been violated. However, when the job is run the first time (when the target is empty) there is no such error.


Any ideas/suggestions/help would be greatly appreciated.

Thanks