Join Throws error

Posted: Thu Mar 06, 2008 10:01 am
by r_arora
I am trying to join two Oracle tables in DataStage 8.0.1, but the job aborts with the following error:

Join_97,1: Failure during execution of operator logic.
Join_97,1: Input 0 consumed 9 records.
Join_97,1: Output 0 produced 9 records.
Join_97,1: Fatal Error: Pipe write failed: Broken pipe
Join_97,0: Failure during execution of operator logic.
Oracle_Enterprise_85,0: sendWriteSignal() failed on node SVKCDISD01 ds = 0 conspart = 1 Broken pipe
Oracle_Enterprise_85,0: Could not send close message (shared memory)
node_node1: Player 2 terminated unexpectedly.
node_node2: Player 2 terminated unexpectedly.
main_program: APT_PMsectionLeader(1, node1), player 2 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 2 - Unexpected exit status 1.
main_program: Step execution finished with status = FAILED.

But when I use a Lookup stage instead of the Join stage, my job runs successfully. My reference tables have a lot of data, so using a lookup is not an option.
Can anyone suggest a possible reason why this is happening?

Thanks

Posted: Thu Mar 06, 2008 10:11 am
by ArndW
The first error is a write error in the Join stage, which leads to the question: what stage is it writing to? If you make a test copy of your job and output to a Peek stage, does the error remain?

Posted: Thu Mar 06, 2008 10:29 am
by r_arora
Yes, I tried using a Peek stage instead of writing the result to a sequential file, but I'm getting the same error.

Re: Join Throws error

Posted: Thu Mar 06, 2008 10:59 am
by SURA
Hi,

Let me know what other stages you used in that job. I feel it might be because a function is being used in the wrong manner.

Sura

Re: Join Throws error

Posted: Thu Mar 06, 2008 11:05 am
by r_arora
I'm just joining two Oracle stages and writing the result directly to a sequential file. I have not used a Transformer either.

Posted: Thu Mar 06, 2008 11:14 am
by kumar_s
What is the average number of rows you are operating on?
If your target is a Dataset, do you get the same error?
Are there several other jobs running on your server at the same time?

Posted: Thu Mar 06, 2008 12:20 pm
by Krazykoolrohit
Change the following parameters:

$APT_MONITOR_TIME = 5
$APT_MONITOR_SIZE = 100000

This may resolve your warning "Oracle_Enterprise_85,0: sendWriteSignal() failed on node SVKCDISD01 ds = 0 conspart = 1 Broken pipe".

Let me know if this works.

Posted: Fri Mar 07, 2008 3:29 am
by ArndW
KrazyKoolRohit - that is an interesting post. I'm unclear as to how those values could affect execution; could you perhaps explain that?

Posted: Fri Mar 07, 2008 11:07 am
by Krazykoolrohit
ArndW,

I faced the same issue yesterday and was browsing DSX for a resolution. I found a lot of threads discussing this issue.

ex: viewtopic.php?t=116566&start=0&postdays ... NITOR_SIZE

I changed the parameters and got rid of the fatal warnings, so I thought it might help r_arora as well.

Posted: Fri Mar 07, 2008 11:14 am
by ArndW
KrazyKoolRohit - Thanks, I see now! I am not sure that the error causes are related, but they are similar and it is certainly worth a try to see if it makes a difference.

Posted: Wed Mar 12, 2008 10:11 am
by r_arora
I changed the values of APT_MONITOR_SIZE and APT_MONITOR_TIME but I am still getting a broken pipe error. It's strange that when I replace the Join stage with a Lookup stage, my job runs successfully and loads the data to the sequential file.
Any help regarding this issue will be greatly appreciated.

Posted: Wed Mar 12, 2008 10:53 am
by ArndW
r_arora - the APT_... value changes were a long shot but worth trying.
What are the data volumes for the two links and could you tell us what you are doing in the join?

Posted: Wed Mar 12, 2008 10:58 am
by r_arora
I am doing an inner join between two Oracle tables having around 20 lakh records. Actually, one of the sources has about 90 million records, but I have used a User Defined SQL there which returns about 20 lakh records that are passed to the Join stage.

Posted: Wed Mar 12, 2008 11:11 am
by r_arora
Just to clarify, 20 lakh is 2 million records.

Posted: Wed Mar 12, 2008 11:15 am
by ArndW
Can you monitor your scratch space while running this job? Also, if you halve the number of rows does the job run through (i.e. determine if your error is caused by data volumes)?
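
One way to watch scratch space during the run is to poll the filesystem from a second session. The following is only a rough sketch in Python, not anything DataStage-specific: the function names are made up for illustration, and the directory you pass in is an assumption (it should be whatever the scratchdisk entries in your parallel configuration file point to).

```python
import shutil
import time


def sample_free_space(path, samples=3, interval_s=1.0):
    """Take `samples` readings of (used_bytes, free_bytes) for the
    filesystem that holds `path`, waiting `interval_s` seconds between them."""
    readings = []
    for i in range(samples):
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        readings.append((usage.used, usage.free))
        if i < samples - 1:
            time.sleep(interval_s)
    return readings


def min_free_gib(readings):
    """Return the lowest free-space reading, converted to GiB."""
    return min(free for _, free in readings) / 2**30
```

For example, calling sample_free_space("/path/to/scratch", samples=600, interval_s=5) while the job runs (the path is a placeholder) and then passing the result to min_free_gib shows how close the scratch filesystem came to filling up. If free space drops towards zero just before the broken pipe error, data volume is the likely cause.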