Join Throws error

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

r_arora
Participant
Posts: 20
Joined: Tue Mar 04, 2008 10:30 am

Join Throws error

Post by r_arora »

I am trying to join two Oracle tables in DataStage 8.0.1, but the job aborts and gives me the following error:

Join_97,1: Failure during execution of operator logic.
Join_97,1: Input 0 consumed 9 records.
Join_97,1: Output 0 produced 9 records.
Join_97,1: Fatal Error: Pipe write failed: Broken pipe
Join_97,0: Failure during execution of operator logic.
Oracle_Enterprise_85,0: sendWriteSignal() failed on node SVKCDISD01 ds = 0 conspart = 1 Broken pipe
Oracle_Enterprise_85,0: Could not send close message (shared memory)
node_node1: Player 2 terminated unexpectedly.
node_node2: Player 2 terminated unexpectedly.
main_program: APT_PMsectionLeader(1, node1), player 2 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 2 - Unexpected exit status 1.
main_program: Step execution finished with status = FAILED.

But when I use a Lookup stage instead of a Join stage, my job runs successfully. My reference tables have a lot of data, so using a lookup is not an option.
Can anyone suggest a possible reason why this is happening?

Thanks
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

The first error is a write error in the Join stage, which raises the question: what stage is downstream of the Join? If you make a test copy of your job and output to a Peek stage instead, does the error remain?
r_arora
Participant
Posts: 20
Joined: Tue Mar 04, 2008 10:30 am

Post by r_arora »

Yes, I tried using a Peek stage instead of writing the result to a sequential file, but I'm getting the same error.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: Join Throws error

Post by SURA »

r_arora wrote: [original post and error log quoted above]
Hi,

Let me know what other stages you used in that job. I feel it might be because a function is being used in the wrong manner.

Sura
r_arora
Participant
Posts: 20
Joined: Tue Mar 04, 2008 10:30 am

Re: Join Throws error

Post by r_arora »

I'm just joining two Oracle stages and writing the result directly to a sequential file. I have not used a Transformer either; the results go straight to the sequential file.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

What is the average number of rows you are operating with?
If your target is a Dataset, do you get the same error?
Are there several other jobs running on your server at the same time?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Krazykoolrohit
Charter Member
Posts: 560
Joined: Wed Jul 13, 2005 5:36 am
Location: Ohio

Post by Krazykoolrohit »

Change the following parameters:

$APT_MONITOR_TIME = 5
$APT_MONITOR_SIZE = 100000

This may resolve your warning "Oracle_Enterprise_85,0: sendWriteSignal() failed on node SVKCDISD01 ds = 0 conspart = 1 Broken pipe".

Let me know if this works.
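For reference, a minimal sketch of setting those two variables for a single run from the shell. The variable names are the real APT environment variables mentioned above, but the export-before-run approach is an assumption; in practice they are usually defined in the DataStage Administrator client or as job parameters.

```shell
# Sketch only: normally these are set via the Administrator client
# or as job parameters rather than exported by hand.
# APT_MONITOR_TIME: seconds between job-monitor updates.
# APT_MONITOR_SIZE: rows processed between monitor updates.
export APT_MONITOR_TIME=5
export APT_MONITOR_SIZE=100000
```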
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

KrazyKoolRohit - that is an interesting post. I'm unclear as to how those values could affect execution; could you perhaps explain that?
Krazykoolrohit
Charter Member
Posts: 560
Joined: Wed Jul 13, 2005 5:36 am
Location: Ohio

Post by Krazykoolrohit »

ArndW,

I faced the same issue yesterday and was browsing DSXchange for a resolution. I found a lot of threads discussing this issue.

ex: viewtopic.php?t=116566&start=0&postdays ... NITOR_SIZE

I changed the parameters and got rid of the fatal warnings, so I thought it might help r_arora as well.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

KrazyKoolRohit - Thanks, I see now! I am not sure that the error causes are related, but they are similar and it is certainly worth a try to see if it makes a difference.
r_arora
Participant
Posts: 20
Joined: Tue Mar 04, 2008 10:30 am

Post by r_arora »

I changed the values of APT_MONITOR_SIZE and APT_MONITOR_TIME but am still getting a broken pipe error. It's strange that when I replace the Join stage with a Lookup stage, my job runs successfully and loads the data to the sequential file.
Any help regarding this issue will be greatly appreciated.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

r_arora - the APT_... value changes were a long shot but worth trying.
What are the data volumes for the two links and could you tell us what you are doing in the join?
r_arora
Participant
Posts: 20
Joined: Tue Mar 04, 2008 10:30 am

Post by r_arora »

I am doing an inner join between two Oracle tables, each having around 20 lakh records. Actually, one of the sources has about 90 million records, but I have used user-defined SQL there which returns about 20 lakh records that are passed to the Join stage.
r_arora
Participant
Posts: 20
Joined: Tue Mar 04, 2008 10:30 am

Post by r_arora »

Just to clarify: 20 lakh is 2 million records.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Can you monitor your scratch space while running this job? Also, if you halve the number of rows, does the job run through (i.e. to determine whether the error is caused by data volume)?
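One simple way to watch scratch usage during the run is a shell loop over `df`. The path below is a placeholder assumption; use the actual "resource scratchdisk" paths from your APT_CONFIG_FILE.

```shell
# Placeholder path: substitute the "resource scratchdisk" entries
# defined in your APT_CONFIG_FILE.
SCRATCH=/scratch

# Print a timestamped free-space report every 10 seconds while the
# job runs; stop with Ctrl-C once the job finishes or aborts.
while true; do
    date
    df -k "$SCRATCH"
    sleep 10
done
```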
Post Reply