Hierarchical Stage XML transformation aborts
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 152
- Joined: Tue Jan 13, 2009 8:59 am
We have a job that creates a JSON file using the Hierarchical stage for XML transformation. The stage is fed by three Sequential File stages, and its assembly performs Restructure and HJoin steps. The job fails after reading 13.5 million records, but it needs to process 188 million. Is there a setting or environment variable for the stage we should tweak to handle larger volumes? The log is not much help (we are rerunning the job with operator combination disabled); it simply fails after a partial read of the big sequential file.
I appreciate any input on this.
Failure during execution of operator logic.
Output 0 produced 13573742 records.
node_node2: Player 1 terminated unexpectedly.
main_program: APT_PMsectionLeader(2, node2), player 1 - Unexpected termination by Unix signal 9 (SIGKILL).
SequentialStage---------|
SequentialStage---->XMLTransformation---->OutputFile
SequentialStage---------|
Thanks.
[Note: topic title changed to correctly reflect hierarchical stage usage - Andy]
Do any of the later error messages contain Java errors that refer to stack or heap size issues? If so, you need to increase those settings for the stage (each defaults to 256 MB).
Also, assuming you don't need all 10+ million records at once, you can use the Split data into batches option to chop the input into smaller sections for processing. Make sure the data is sorted on a key (even an artificial one you had to construct) and specify that as the batch split key. This is a sort of "wave equivalent" that tells the stage to process the data in smaller chunks.
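If no natural key exists, an artificial batch key can be built upstream simply by bucketing a running record number, so consecutive records share a key the stage can split on. A minimal sketch of the idea in Python (the 500,000 batch size is an assumed tuning value, not anything the stage mandates):

```python
# Sketch: derive an artificial, already-sorted batch key by bucketing the
# running record number. Records sharing a key form one batch for the stage.
BATCH_SIZE = 500_000  # assumed batch size; tune to what the stage can digest

def add_batch_key(records):
    """Yield (batch_key, record); the key increments every BATCH_SIZE rows."""
    for i, rec in enumerate(records):
        yield i // BATCH_SIZE, rec

# 1.2M dummy records fall into three batches with keys 0, 1, 2
batch_keys = sorted({k for k, _ in add_batch_key(range(1_200_000))})
print(batch_keys)  # -> [0, 1, 2]
```

Because the key is derived from the record position, the data is already sorted on it, which satisfies the sorted-key requirement for the batch split option.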
-
- Participant
- Posts: 152
- Joined: Tue Jan 13, 2009 8:59 am
You would have to look at the later error messages in the log. This is one of the few cases in DataStage where the first message does not tell you the actual cause of the abort.
Look further down in the log and see if any of the messages contain Java errors (like "Java runtime exception occurred: java.lang.OutOfMemoryError").
If they do, post the contents here, we'll look at them.
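If the log has been exported to a text file, the Java errors can be pulled out quickly from the command line. A sketch, where job.log is a hypothetical export of the Director log (the sample lines below stand in for a real export):

```shell
# Simulate an exported job log (in practice this would be the log saved
# from the Director client; "job.log" is a placeholder name).
cat > job.log <<'EOF'
seq_pfsComp,0: Output 0 produced 9581094 records.
seq_JSON_Extract,0: Java runtime exception occurred: java.lang.OutOfMemoryError
EOF

# Pull out only the Java-related messages that usually reveal the real cause.
grep -iE "OutOfMemoryError|StackOverflowError|java\.lang" job.log
```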
-
- Participant
- Posts: 152
- Joined: Tue Jan 13, 2009 8:59 am
I looked further down the log, and these are the additional messages I see:
node_node2: Player 1 terminated unexpectedly.
seq_pfsComp,0: Failure during execution of operator logic.
seq_pfsComp,0: Output 0 produced 9581094 records.
seq_pfsComp,0: Fatal Error: Unable to allocate communication resources
main_program: APT_PMsectionLeader(2, node2), player 1 - Unexpected termination by Unix signal 9 (SIGKILL).
seq_JSON_Extract,0: Failure during execution of operator logic.
seq_JSON_Extract,0: Input 0 consumed 0 records.
seq_JSON_Extract,0: Fatal Error: waitForWriteSignal(): Premature EOF on node apsrd3247 Socket operation on non-socket
node_node1: Player 1 terminated unexpectedly.
main_program: APT_PMsectionLeader(1, node1), player 1 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 2 - Unexpected exit status 1.
main_program: Step execution finished with status = FAILED.
-
- Premium Member
- Posts: 425
- Joined: Sat Nov 19, 2005 9:26 am
- Location: New York City
- Contact:
Does the job finish OK when you process small files? Make sure the process works with smaller files, then increase the size up to the point where it fails, before starting the tuning exercise.
I took a similar approach while parsing a huge XML file and couldn't tune the settings enough to digest it...
Could part of the Restructure be done outside the Hierarchical stage? Could you do the join outside too? That should reduce the workload on that part of your process.
Regards
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
-
- Participant
- Posts: 152
- Joined: Tue Jan 13, 2009 8:59 am
Thanks for your input. Yes, the job runs fine up to 13.5 million records and fails after that. Even bumping up the heap didn't help. We are raising a ticket with IBM to see if they can help.
As for doing the Restructure and HJoins outside of that stage, we are taking an alternate approach to avoid using it and seeing if we can build the logic in a Transformer.
-
- Premium Member
- Posts: 425
- Joined: Sat Nov 19, 2005 9:26 am
- Location: New York City
- Contact:
See if this technote helps:
http://www-01.ibm.com/support/docview.w ... wg21503212
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
When processing several million line items, I hit a limit also.
Eventually I found a set of optional Java arguments that worked.
The final configuration for the hierarchical stage was:
Usage/Java/Heap Size (MB): 1024
Usage/Java/Stack Size (KB): 2048
Usage/Java/Optional Arguments: -Xjit:dontInline={com/ibm/xml/xlxp/api/util/SimplePositionHelper.getCurrentPosition10*,com/ibm/xml/xlxp/api/util/DataBufferHelper.computeCoords10*},{com/ibm/xml/xlxp/api/util/SimplePositionHelper.getCurrentPosition10*}(disableGLU),{com/ibm/xml/xlxp/api/util/DataBufferHelper.computeCoords10*}(disableGLU)
Usage/Scratch Disk: Yes