XML single tag

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

kennyapril
Participant
Posts: 248
Joined: Fri Jul 30, 2010 9:04 am

Post by kennyapril »

I am parsing an XML document using the XML Input stage. Elements with separate open and close tags are parsed into records, but single (self-closing) tags show up as the tag itself in the records where the value should be null. How do I get a null in place of <Faxno/>?

Please let me know if any changes need to be applied.
Regards,
Kenny
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Put /text() at the end of the string in the Description property.
Ernie Ostic
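
As a rough illustration of why selecting the text node helps: an empty element such as <Faxno/> has no text node at all, so a query for its text has nothing to return. The sketch below uses Python's standard xml.etree.ElementTree rather than the DataStage XML Input stage, and the sample document and the helper name text_or_none are assumptions made up for the example; only the <Faxno/> element comes from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; only the <Faxno/> element name comes from the thread.
doc = "<Customer><Name>Example</Name><Faxno/></Customer>"

root = ET.fromstring(doc)


def text_or_none(parent, tag):
    """Return the text content of a child element, or None if it has no text node."""
    elem = parent.find(tag)
    return None if elem is None else elem.text


name = text_or_none(root, "Name")   # element with open/close tags and text
fax = text_or_none(root, "Faxno")   # self-closing element: no text node

print(name, fax)
```

Here the self-closing element yields None rather than the tag itself, which is the behavior the question asks for.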

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>

Post by kennyapril »

I used /text() in the Description and it worked fine when I used an External Source stage as input and gave it the file location. But when I replaced the External Source stage with a Sequential File stage reading the original file, and changed the property in the XML Input stage from file path to XML file, the records are no longer parsed.

I see the warnings below:

XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 1, column: 513): Invalid character (Unicode: 0x0)


XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 1, column: 3): There are more end tags than start tags

XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 1, column: 1): Invalid document structure


Also, FYI: when I open the file that I gave as the source, it opens in Internet Explorer as XML.

Please let me know if I need to change anything.
Regards,
Kenny

Post by eostic »

You have the answer. When reading XML documents from disk, you should only use the External Source stage and pass the document as a URL (a radio button in the stage).

Ernie
Ernie Ostic


Post by kennyapril »

But in my situation I cannot use only the External Source stage because, after parsing this XML, I need to read the CLOB field directly from a table.

After increasing APT_MAX_DELIMITED_SIZE and using the Sequential File stage as input, I got rid of all those warnings, and I can also view the XML in the Sequential File stage. But when parsing I now see just one warning:

XML_Input,0: Warning: CLOB.XML_Input: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 348, column: 31): Expected an element name

Please suggest anything I can do to get rid of this warning and get the parsed records.
Regards,
Kenny

Post by eostic »

If it works with External Source and not with the Sequential File stage, then your XML stage is probably just fine. The Sequential File stage is trashing the XML, because there are likely some special characters, stray CRLFs, or other things that are "noise" for XML but might lead the Sequential File stage astray.

If you are going to be reading from a database, great! Use that for your testing. Put the documents into a database and then use a proper Connector as your source Stage.

Ernie
Ernie Ostic
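
One way to check a document for the kind of special characters mentioned above (for example the 0x0 reported in the earlier warning) is to scan it for code points that XML 1.0 does not allow. This is only a sketch in Python, outside DataStage; the function name find_invalid_xml_chars and the sample string are assumptions made up for the example, while the allowed ranges follow the Char production of the XML 1.0 specification.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, code point in hex) for each character XML 1.0 forbids."""
    return [(m.start(), hex(ord(m.group()))) for m in _INVALID_XML_CHARS.finditer(text)]


# Hypothetical sample: a NUL character embedded in an element value.
sample = "<Faxno>555" + "\x00" + "</Faxno>"

print(find_invalid_xml_chars(sample))          # offsets of forbidden characters
print(find_invalid_xml_chars("<a>ok\r\n</a>"))  # CR and LF are allowed in XML
```

A non-empty result for text that came through an intermediate stage, but an empty result for the original file, would point at the intermediate stage rather than the XML stage.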


Post by kennyapril »

I will try using a table as input. Thank you very much for the input.
Regards,
Kenny

Post by kennyapril »

Yes Ernie, as you said, the issue was with the data in the columns when I used a sequential file. I now use the tables directly as the source and it works fine.

Thanks again for all the information :)
Regards,
Kenny

Post by eostic »

Great! The reason is that the Sequential File stage looks for consistent records, which it usually has. XML, on the other hand, is anything but consistent: odd characters, CRLFs, LFs, CRs, etc. can appear anywhere in the document. XML parsers generally ignore line breaks like these, but the Sequential File stage isn't easily configured to handle such things and may unexpectedly split your XML into more than one row, thus killing the downstream XML stage that expects a "whole" document.

Ernie
Ernie Ostic
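
The effect described above can be reproduced outside DataStage. The sketch below, in Python with the standard xml.etree.ElementTree parser, treats each line of a document as its own "row", which is only an assumption about how a record-oriented reader might split the input. The sample document and the helper name is_well_formed are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as a complete XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical document containing CRLF line breaks between elements.
doc = "<Customer>\r\n<Name>Example</Name>\r\n<Faxno/>\r\n</Customer>"

# Parsed as one unit, the line breaks are harmless.
whole_ok = is_well_formed(doc)

# Split into "rows" at line breaks, the opening and closing rows are
# no longer complete documents and fail to parse on their own.
rows = doc.splitlines()
rows_ok = [is_well_formed(row) for row in rows]

print(whole_ok, len(rows), rows_ok[0], rows_ok[-1])
```

The whole document parses, while the first and last rows fail, which resembles the "more end tags than start tags" and "invalid document structure" warnings earlier in the thread.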
