Parsing XML problems with Namespace
Posted: Wed Feb 04, 2009 9:07 pm
I'm getting a problem when I try to parse an XML file which contains namespace info. It also contains xsi:nil="true" attributes on some fields.
The XML is (cut down a bit):
<?xml version="1.0" encoding="UTF-8" ?>
<PublishMXSERVITEM xmlns="http://www.ibm.com/maximo" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" baseLanguage="EN" creationDateTime="2009-02-04T16:34:04+11:00" event="0" maximoVersion="7 1 137 V7110-890" messageID="12337256452217794" transLanguage="EN">
<MXSERVITEMSet>
<SERVICEITEMS>
<CUSLOWERLIMIT xsi:nil="true" />
<CUSREQDETAILS>0</CUSREQDETAILS>
<CUSUPPERLIMIT xsi:nil="true" />
<STATUSDATE>2009-01-21T17:20:47+11:00</STATUSDATE>
</SERVICEITEMS>
</MXSERVITEMSet>
</PublishMXSERVITEM>
I hope that is readable. I've dropped a lot of the record, but this amount does work/fail as follows:
As is: no records output.
Drop the namespace info only: fails due to the xsi:nil="true".
Drop the xsi:nil="true" and keep the namespace info: no records output.
Drop both the namespace info and the xsi:nil="true": 1 record output.
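The "no records" results look like what you would see if the element paths were matched without the default namespace (http://www.ibm.com/maximo). That is only a guess about what the job is doing, but the effect is easy to reproduce outside the job. A minimal sketch in Python using the standard library's xml.etree.ElementTree; the "mx" prefix and the record() helper are names made up for this illustration, and the root attributes are trimmed:

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the sample above (most root attributes omitted).
SAMPLE = """<?xml version="1.0" encoding="UTF-8" ?>
<PublishMXSERVITEM xmlns="http://www.ibm.com/maximo" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" baseLanguage="EN">
<MXSERVITEMSet>
<SERVICEITEMS>
<CUSLOWERLIMIT xsi:nil="true" />
<CUSREQDETAILS>0</CUSREQDETAILS>
<CUSUPPERLIMIT xsi:nil="true" />
<STATUSDATE>2009-01-21T17:20:47+11:00</STATUSDATE>
</SERVICEITEMS>
</MXSERVITEMSet>
</PublishMXSERVITEM>"""

# "mx" is an arbitrary prefix chosen here; only the URI has to match the file.
NS = {"mx": "http://www.ibm.com/maximo"}
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

root = ET.fromstring(SAMPLE.encode("utf-8"))

# Unqualified path: the elements live in the default namespace, so nothing matches.
unqualified = root.findall(".//SERVICEITEMS")

# Namespace-qualified path: the record is found.
qualified = root.findall(".//mx:SERVICEITEMS", NS)


def record(item):
    """Turn one SERVICEITEMS element into a dict, mapping xsi:nil="true" to None."""
    row = {}
    for child in item:
        name = child.tag.split("}", 1)[-1]  # drop the "{uri}" part of the tag
        row[name] = None if child.get(XSI_NIL) == "true" else child.text
    return row


rows = [record(item) for item in qualified]
print(len(unqualified), len(qualified), rows)
```

If the stage can be given namespace-qualified paths, that may be a cleaner route than editing the file, but I have not confirmed how the stage handles this.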
It's a fairly simple job: Folder -> XML -> Transformer -> Seq File, as I'm trying it as a proof of concept for a new feed for us.
The source team insists they can't adjust the output, as that is how the application produces it. Short of pre-parsing the file and removing the offending data, I've run out of ideas.
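For reference, the "parse and remove" fallback could look something like the sketch below. It is a rough pre-processing step in Python (standard library only), not something taken from the job itself; strip_namespaces is a hypothetical helper name. It drops every namespace from element and attribute names and removes the xsi attributes, which matches the one variant of the file that produced a record.

```python
import xml.etree.ElementTree as ET

XSI = "{http://www.w3.org/2001/XMLSchema-instance}"


def strip_namespaces(xml_bytes):
    """Return the document with namespaces removed and xsi:* attributes dropped."""
    root = ET.fromstring(xml_bytes)
    for elem in root.iter():
        # ElementTree stores qualified names as "{uri}local"; keep only "local".
        elem.tag = elem.tag.split("}", 1)[-1]
        for key in list(elem.attrib):
            if key.startswith(XSI):
                del elem.attrib[key]  # e.g. xsi:nil="true"
            elif key.startswith("{"):
                # Any other namespaced attribute keeps its value under the local name.
                elem.attrib[key.split("}", 1)[-1]] = elem.attrib.pop(key)
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    sample = (
        b'<PublishMXSERVITEM xmlns="http://www.ibm.com/maximo" '
        b'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
        b'<CUSLOWERLIMIT xsi:nil="true" />'
        b'<CUSREQDETAILS>0</CUSREQDETAILS>'
        b'</PublishMXSERVITEM>'
    )
    print(strip_namespaces(sample).decode("utf-8"))
```

One caveat with this approach: once xsi:nil is removed, a nil field is indistinguishable from an empty element, so it only suits feeds where that difference does not matter downstream.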
Can anyone suggest possible ways forward? More information is of course available if I've missed something pertinent.