Teradata Stage and SIGSEGV error

Help_DSEE
Participant
Posts: 2
Joined: Mon Oct 03, 2005 3:27 am

Post by Help_DSEE »

Hi,

We are facing some issues while working with the Teradata Enterprise stage. Whenever we try to perform a complex SQL join in the query in the Teradata Enterprise stage, the job aborts with a SIGSEGV (segmentation violation) and a core dump, or with a SIGILL. The same query gives the correct output in Teradata query manager. When two stages are used to read the tables and a DataStage join is used instead, it works fine.

Initially we thought this was caused by spool space issues at the database end, so splitting up the SQL joins was recommended to the team. But we now face the same issue with a simple select from the database. The table has around 2 million records and the query is a direct select with no join involved. Even for this, a SIGSEGV error is raised and the job aborts. When we run the same query in Teradata query manager, it returns the records.

Are there any settings on the DataStage end that we have to take care of to overcome this? I have increased APT_BUFFER_FREE_RUN from the default 0.5 to 0.8 to increase buffer availability, and set APT_TERA_64K_BUFFERS to true. It didn't help much; the jobs still abort with the same error.
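
For reference, a minimal sketch of the two settings described above, written as shell exports. It assumes the variables are set in the shell environment that the parallel job inherits; they may equally be defined as project-level or job-level environment variables. The literal value used for APT_TERA_64K_BUFFERS is an assumption, since the post only says it was set to true.

```shell
# Sketch only: values are the ones mentioned in the post.
# Raise APT_BUFFER_FREE_RUN from the default 0.5 to 0.8.
export APT_BUFFER_FREE_RUN=0.8

# Enable 64K buffers for the Teradata operators.
# The literal value "true" is an assumption taken from the post's wording.
export APT_TERA_64K_BUFFERS=true

# Show the APT_ variables that a job started from this shell would see.
env | grep '^APT_'
```

Checking the output of the last command before starting the job confirms that both values actually reached the environment.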

Below are the errors we get when doing View Data on the stage.


Regards,
Praveen

SIGSEGV:
------------
This step has 1 dataset:
ds0: {op0[3p] (parallel TD_MED_WIRELINE_CALL_HIST_CLEAN)
>>eCollectAny
op1[1p] (sequential APT_CombinedOperatorController:_Head)}

It has 2 operators:
op0[3p] {(parallel TD_MED_WIRELINE_CALL_HIST_CLEAN)
on nodes (
node1[op0,p0]
node2[op0,p1]
node1[op0,p2]
)}
op1[1p] {(sequential APT_CombinedOperatorController:
(_Head)
(_PEEK_IDENT_)
(_ABORT_IDENT_)
) on nodes (
node2[op1,p0]
)}
It runs 4 processes on 2 nodes.
Program "/bin/sh" terminated. [SIGSEGV] segmentation violation

SIGILL:
---------
##I TFSC 000000 13:10:30(006) <main_program>
This step has 1 dataset:
ds0: {op0[3p] (parallel TD_MED_WIRELINE_CALL_HIST_CLEAN)
>>eCollectAny
op1[1p] (sequential APT_CombinedOperatorController:_Head)}

It has 2 operators:
op0[3p] {(parallel TD_MED_WIRELINE_CALL_HIST_CLEAN)
on nodes (
node1[op0,p0]
node2[op0,p1]
node1[op0,p2]
)}
op1[1p] {(sequential APT_CombinedOperatorController:
(_Head)
(_PEEK_IDENT_)
(_ABORT_IDENT_)
) on nodes (
node2[op1,p0]
)}
It runs 4 processes on 2 nodes.
Program "/bin/sh" terminated. [SIGILL] illegal instruction
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
Welcome aboard :D
Do a search on "signal handler"; you may get a clearer picture of this.

regards
kumar
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

What version of PX and TTU are you running? Also, how large is the query (number of characters)?