That specific error message is typically not the cause of the problem, but a symptom of job processes terminating; i.e. you should look elsewhere in the job and logs for the root cause of the problem.
As Arnd already noted, you will have to look through all the different error and warning messages. If you cannot find the cause, try posting the error and warning lines.
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
Reading from the sequential file is not a problem, because we are able to view the data from the Designer.
All the errors appear for the Oracle Enterprise stage.
There is no other message.
The only error messages are:
grh,0: Failure during execution of operator logic.
grh,0: Input 0 consumed 0 records.
grh,0: Fatal Error: waitForWriteSignal(): Premature EOF on node server.domain.com Socket operation on non-socket
(grh is the name of one of the Oracle stages.)
The above set of messages repeat for each oracle stage.
Then we get
node_node01: Player 1 terminated unexpectedly.
main_program: APT_PMsectionLeader(1, node01), player 1 - Unexpected exit status 1.
node_node01: Player 25 terminated unexpectedly.
main_program: APT_PMsectionLeader(1, node01), player 25 - Unexpected exit status 1.
followed by similar messages for player 2, player 3, ... on node02, node03, etc.
Only this job was running.
Please note that other jobs using the Oracle Enterprise stage are working fine, so the problem is with this job in particular. Also, this job worked in version 8.1 and is being run for the first time in 11.3, where it fails.
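To make the "symptom versus root cause" distinction concrete, here is a minimal Python sketch that sorts the messages quoted above into known symptoms and everything else. The pattern list, the function name `triage`, and the assumption that every log line has the form `source: message` are mine, based only on the lines in this thread; this is not a DataStage tool.

```python
import re
from collections import defaultdict

# Assumed symptom patterns, taken only from the messages quoted in this thread.
SYMPTOM_PATTERNS = [
    r"terminated unexpectedly",
    r"Unexpected exit status",
    r"Premature EOF",
    r"Failure during execution of operator logic",
    r"consumed \d+ records",
]
SYMPTOM_RE = re.compile("|".join(SYMPTOM_PATTERNS))


def triage(lines):
    """Split 'source: message' lines into symptoms and other messages, keyed by source."""
    symptoms = defaultdict(list)
    others = defaultdict(list)
    for line in lines:
        source, sep, message = line.partition(": ")
        if not sep:
            continue  # skip lines that are not in 'source: message' form
        bucket = symptoms if SYMPTOM_RE.search(message) else others
        bucket[source].append(message)
    return dict(symptoms), dict(others)


# Log lines copied from the post above.
log = [
    "grh,0: Failure during execution of operator logic.",
    "grh,0: Input 0 consumed 0 records.",
    "grh,0: Fatal Error: waitForWriteSignal(): Premature EOF on node "
    "server.domain.com Socket operation on non-socket",
    "node_node01: Player 1 terminated unexpectedly.",
    "main_program: APT_PMsectionLeader(1, node01), player 1 - Unexpected exit status 1.",
]

symptoms, others = triage(log)
print("symptom sources:", sorted(symptoms))
print("other messages:", others)
```

With these lines nothing lands in `others`, which matches the observation that the log holds no message beyond the termination symptoms, so the cause has to be isolated by changing the job.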
I recommend doing the following steps to simplify your error search:
1. Add $APT_CONFIG_FILE to your job with a 1-node configuration. This will reduce your processes and make debugging much easier.
2. After doing (1) and checking to make sure that the error is still present (and that you still cannot find details in the log), replace the Oracle output stage with a PEEK stage and re-run the job; the error will go away if the root cause is indeed in the Oracle stage.
3. What type of write is being done in the Oracle stage - insert, update, bulk load? Anything different from the other jobs which work?
4. Add a reject link to the stage and see if the job still aborts or just writes lots of rejects.
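For step 1, a one-node configuration file can be written by hand or generated. The Python sketch below only builds the text of such a file. The function name `one_node_config`, the node name, and the two directory paths are placeholders that I made up for illustration; substitute the disk and scratch disk resources from your existing configuration file, then point $APT_CONFIG_FILE at the saved result.

```python
def one_node_config(fastname, disk, scratch, node_name="node1"):
    """Return the text of a single-node parallel configuration file.

    All argument values are supplied by the caller; nothing here is
    read from a real DataStage installation.
    """
    return (
        "{\n"
        f'\tnode "{node_name}"\n'
        "\t{\n"
        f'\t\tfastname "{fastname}"\n'
        '\t\tpools ""\n'
        f'\t\tresource disk "{disk}" {{pools ""}}\n'
        f'\t\tresource scratchdisk "{scratch}" {{pools ""}}\n'
        "\t}\n"
        "}\n"
    )


# Placeholder paths (hypothetical); the host name is the one shown in the log.
cfg = one_node_config("server.domain.com", "/tmp/ds/datasets", "/tmp/ds/scratch")
print(cfg)
```

With a single node, each operator runs as a single player process, so the log contains far fewer "Player N terminated unexpectedly" lines to wade through.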
Created a copy of the job with just one Oracle stage.
Job failed.
Replaced the Oracle stage with a Sequential File stage.
Job failed.
Removed all stage derivations from the Transformer. Job was successful.
Went back to the original job that had one Oracle stage and changed the first stage variable derivation to just -1, so the variable is set to -1.
Job failed.
Added APT_DISABLE_COMBINE=True. Job was successful.
Seems like the problem is with the stage variables in the Transformer. But this is a working job in 8.1 and it is failing only in 11.3, which leads me to think that something is wrong with our C compiler and the way the Transformer has been compiled.
Now working with IBM tech support to figure out what went wrong and where.
What data type is the stage variable whose value you set to -1? If it's a string of some kind, any literal value needs to appear inside quotation marks, for example "-1".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
hsahay wrote: Added APT_DISABLE_COMBINE=True. Job was successful.
I think it's actually APT_DISABLE_COMBINATION but...
This clue alone rings a bell.
I ran into this same behavior on 11.3 due to using a newer compiler than what was supported. Even the same compiler with newer PTF levels applied caused this problem.
The compiler must be exactly the version specified on the system requirements page.