XML validation on XML Input stage results in SIGSEGV

niremy
Participant
Posts: 23
Joined: Tue Sep 22, 2009 3:17 am

Post by niremy »

Hello,

I'm trying to build a job with an XML Input stage to transform some data from an XML file into a sequential file. I managed to get the job working with a valid XML file thanks to Kim Duke and his XML best practices.

Now I want to validate the input XML to verify that it has the correct structure. So on the General tab of the XML Input stage, I checked "Validate input XML".
When I run the job against a valid XML file I have no problem. When I run it against an invalid XML file, the job aborts with a nice SIGSEGV.

I've searched the DSXchange forum but nobody seems to have faced this problem. Can someone guide me in correcting this issue?
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Check a couple of things... First, how is the XML Schema identified in the document? There should be an attribute on the root element called schemaLocation or sometimes noNamespaceSchemaLocation. This points to the XSD. To be safe, do your testing with this pointing to a 100% local and fully qualified XSD.
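For anyone who wants to check this outside DataStage, here is a minimal Java sketch that reads the schema hint from the root element of a document. It is only an illustration, not part of the XML Input stage: the class name `SchemaHint`, the method `schemaHint`, and the sample namespace and path in `main` are all made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: reports how a document identifies its XML Schema.
final class SchemaHint {
    // Namespace of the xsi:schemaLocation / xsi:noNamespaceSchemaLocation attributes.
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    /** Returns the schema hint found on the root element, or "" if there is none. */
    static String schemaHint(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getAttributeNS lookups
            DocumentBuilder builder = factory.newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml)))
                                  .getDocumentElement();

            // getAttributeNS returns an empty string when the attribute is absent.
            String location = root.getAttributeNS(XSI, "schemaLocation");
            if (!location.isEmpty()) {
                return "schemaLocation=" + location.trim();
            }
            String noNamespace = root.getAttributeNS(XSI, "noNamespaceSchemaLocation");
            if (!noNamespace.isEmpty()) {
                return "noNamespaceSchemaLocation=" + noNamespace.trim();
            }
            return "";
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Sample document with an assumed namespace and an assumed absolute XSD path.
        String xml = "<root xmlns=\"http://www.example.org/ns\""
                   + " xmlns:xsi=\"" + XSI + "\""
                   + " xsi:schemaLocation=\"http://www.example.org/ns /path/to/schema.xsd\"/>";
        System.out.println(schemaHint(xml));
    }
}
```

With `schemaLocation` the value is a pair (namespace URI, then the XSD location), so the second token is the one that should be the fully qualified local path.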

Ernie
Ernie Ostic

blogit!
Open IGC is Here!: https://dsrealtime.wordpress.com/2015/0 ... ere/
niremy
Participant
Posts: 23
Joined: Tue Sep 22, 2009 3:17 am

Post by niremy »

Thanks for the quick reply.

As you suspected, the XML schema reference was simply pointing to the local file schema.xsd. I've replaced schema.xsd with the full path to the schema:

Code:

xsi:schemaLocation="http://www.example.org/schema /home/isadmin/schema.xsd ">
Unfortunately, validation still shows no errors with the correct XML file and produces a SIGSEGV as soon as I introduce a schema violation (I tried changing an integer value in my correct file into a string).

I also intentionally used a wrong path for the schema file. This time I received a message on the reject link of my XML Input stage:
XML input document parsing failed. Reason: Xalan warning (publicId: , systemId: , line: 0, column: 0): An exception occurred! Type:RuntimeException, Message:Warning: The primary document entity could not be opened. Id=/TotallyBadPath/schema.xsd
What should I check next?
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Support. There have been a variety of patches to validation. I haven't heard of one blowing up exactly like this, but it's certainly possible, and you've narrowed things down. I will assume it works perfectly when you uncheck the box.

Ernie
niremy
Participant
Posts: 23
Joined: Tue Sep 22, 2009 3:17 am

Post by niremy »

Thanks again.

I may have an interesting additional piece of information. As a colleague pointed out, I had a reject output to which I assumed the validation errors would be sent.
I removed this reject output, and the XML validation then sent a message to the log with the expected validation error, alongside a bunch of other errors like these:

Code:

Peek_35,0: Fatal Error: waitForWriteSignal(): Premature EOF on node XXX Socket operation on non-socket

node_node1: Player 1 terminated unexpectedly.

main_program: APT_PMsectionLeader(1, node1), player 1 - Unexpected exit status 1.
(I used a Peek stage for testing purposes.)

If you don't have any more clues, I think I will turn to support.
(I have quickly scanned the patches for the 8.1 release and found no information about XML validation problems.)
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Hard to say then... Indeed, the validation errors should go down the reject path, and you can indicate what kinds of errors are to be captured and what should happen. You might want to play with the various options (warning, fatal, etc.) and the column datatype for the column you are using on that particular link. I usually have just one column on the reject link, along with another column that uses passthru for the content or the URL that arrived on the input link (just give it the same column name as on your input link).

Ernie
niremy
Participant
Posts: 23
Joined: Tue Sep 22, 2009 3:17 am

Post by niremy »

Back in the business.

I'm pleased to see that I didn't take the wrong path using the XML validation, as you described exactly the job that I designed.

Today I tried the various options you described. I even used Decimal for the Message format, but unfortunately I didn't receive a single warning for the type mismatch... the SIGSEGV occurs first.

So I think I have a single option left: contact support and/or my technical team.

I'll get back when I have a solution (if I find one) or a workaround (I'm thinking of external Java schema validation, but it can be cumbersome to get the validation error messages).
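For reference, a minimal sketch of what that external Java validation could look like, using the standard `javax.xml.validation` API with an `ErrorHandler` that collects the messages instead of stopping at the first one. This is only an assumption about how the workaround might be built and is unrelated to how the XML Input stage validates internally; the class name `ExternalXsdCheck`, the `validate` method and the one-element `SAMPLE_XSD` are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical workaround: validate outside the job and collect every message.
final class ExternalXsdCheck {
    // Assumed sample schema: a single integer element named "count".
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"count\" type=\"xs:integer\"/>"
        + "</xs:schema>";

    /** Returns one message per schema violation; an empty list means the XML is valid. */
    static List<String> validate(String xsd, String xml) {
        List<String> messages = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    messages.add(format("warning", e));
                }
                @Override public void error(SAXParseException e) {
                    // Recoverable violation: record it and let validation continue.
                    messages.add(format("error", e));
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed: stop, reported by the catch below
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            // Covers fatal parse errors and an unreadable or invalid XSD.
            messages.add("fatal: " + e.getMessage());
        }
        return messages;
    }

    private static String format(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber()
             + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        // Same kind of violation as in the test above: a string where an integer is expected.
        validate(SAMPLE_XSD, "<count>abc</count>").forEach(System.out::println);
    }
}
```

`StreamSource` also accepts a `java.io.File`, so the same method shape would work against the real XSD and input files; strings are used here only to keep the example self-contained.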
niremy
Participant
Posts: 23
Joined: Tue Sep 22, 2009 3:17 am

Post by niremy »

Hello,

The SIGSEGV error has been corrected by patch JR34558.
For your information, the patch is a port of patch JR33356 for IS 8.1.

I don't have time to fully test the validation functionality; I hope there are no more problems.