XML Input Stage - Cannot Process data

Posted: Thu May 09, 2013 2:06 pm
by reachmexyz
Hello all,

I have a sample XML file like this:
<?xml version="1.0" encoding="utf-16"?>
<data>
<Customer FirstName="abc" Lastname="efg" >
<Address>
<Address1 State="CA" country="USA" >
</Address>
</Customer>
</data>

I am trying to process it as
sequentialfile---->xmlinput---->seqfile

But the job always complains of a structure error.
When I look carefully, <Address1> does not have a closing tag.
FirstName, Lastname, State and country also do not have closing tags, but these are part of Customer and Address, which do have ending tags.

Can DataStage process this kind of file, which is missing ending tags?

Posted: Thu May 09, 2013 4:46 pm
by ray.wurlod
Why should it? No other XML parser can.

Demand well-formed XML from your provider.

Posted: Thu May 09, 2013 8:58 pm
by eostic
Make sure you can open the XML in your browser. If you cannot, then it is not well formed.
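
For anyone who wants a check that does not depend on a particular browser, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is only an illustration, not part of DataStage: the helper name is_well_formed is made up for this example, the sample is copied from the first post with the XML declaration left out, and FIXED assumes the intended markup was a self-closing Address1 element.

```python
import xml.etree.ElementTree as ET

# Sample as posted in the thread (XML declaration omitted):
# <Address1 ...> is opened but never closed.
POSTED = """<data>
<Customer FirstName="abc" Lastname="efg" >
<Address>
<Address1 State="CA" country="USA" >
</Address>
</Customer>
</data>"""

# Assumption: the intended markup self-closes Address1 with "/>".
FIXED = POSTED.replace('country="USA" >', 'country="USA" />')


def is_well_formed(text):
    """Hypothetical helper: return (True, None) if text parses,
    otherwise (False, parser error message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as exc:
        return False, str(exc)


print("posted:", is_well_formed(POSTED))
print("fixed: ", is_well_formed(FIXED))
```

The posted sample should be rejected by the parser, while the variant with the self-closing tag should parse cleanly.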

Posted: Fri May 10, 2013 7:32 am
by reachmexyz
Actually, I am able to open the XML in the browser, so it should be a well-formed XML file.

But I am not sure how to process these records with no end tags.

Posted: Fri May 10, 2013 7:41 am
by chulett
Why not start by posting your "structure error"? That, and confirm that you have accurately posted the sample XML in question.

Posted: Fri May 10, 2013 8:08 am
by adityavinay
Use the XML Stage (XML Parser) instead of XML Input.
The XML Input stage works only if you have the open and close tags in one record (it assumes it is processing record by record).
If your XML data is spread all over the file, the XML Parser is the stage that can parse the data for you.

Posted: Fri May 10, 2013 8:18 am
by eostic
Both xmlInput and the XML Stage work the same way regarding valid XML. Every tag must have a closing tag. Chances are there is a typo in the thread above and Address1 is actually

<Address1 stuff="xxx" morestuff="xxx" />

Without the '/' (as shown in the thread above), the XML is entirely invalid and cannot be read by any tool, let alone DataStage.

Assuming it is valid, read this in xmlInput by using:

/.../.../(whatever number of nodes you need to reach Address).../Address1/@stuff

/.../.../.../Address1/@morestuff
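
To make the path idea concrete outside DataStage, here is a small Python sketch with the standard library's xml.etree.ElementTree. It assumes the sample from the first post with Address1 written as a self-closing element (and the XML declaration left out), in which case the full paths would be /data/Customer/Address/Address1/@State and /data/Customer/Address/Address1/@country. The function name address_rows is made up for this example and says nothing about how xmlInput works internally.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the thread's sample: Address1 is self-closed.
DOC = """<data>
<Customer FirstName="abc" Lastname="efg" >
<Address>
<Address1 State="CA" country="USA" />
</Address>
</Customer>
</data>"""


def address_rows(root):
    """Hypothetical helper: one dict per Address1 element, combining the
    parent Customer attributes with the Address1 attributes."""
    rows = []
    for customer in root.findall("Customer"):
        # Relative path from Customer down to Address1.
        for addr in customer.findall("Address/Address1"):
            rows.append({
                "FirstName": customer.get("FirstName"),
                "Lastname": customer.get("Lastname"),
                "State": addr.get("State"),
                "country": addr.get("country"),
            })
    return rows


root = ET.fromstring(DOC)  # root is the <data> element
rows = address_rows(root)
print(rows)
```

Each attribute is read by name from the element that carries it, which is the same thing the trailing /@stuff step expresses in the path notation.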


Ernie

Posted: Fri May 10, 2013 9:51 am
by reachmexyz
The DataStage error is:

APT_CombinedOperatorController,0: Fatal Error: Fatal: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 1, column: 1): Invalid document structure
Xalan fatal error (publicId: , systemId: , line: 1, column: 1): Invalid document structure


Also, the data posted above is correct. Address1 does not have an ending tag, but I am able to open the file in a browser.

My understanding is that when an XML file can be opened in a browser, it can be considered well-formed XML and can be processed by any XML parser, including the DataStage XML parser.

I will try fixing the file by adding closing tags and see how it goes.

Posted: Fri May 10, 2013 10:41 am
by eostic
I'd like to know what browser you're using... and then encourage you to change it. It should be reporting an error. It's entirely "non-well-formed" XML if <Address1 is left open. The same string from the thread above blows up in both my copies of IE and Firefox, as it should.

Ernie

Posted: Fri May 10, 2013 10:43 am
by adityavinay
reachmexyz wrote:DataStage error is:

APT_CombinedOperatorController,0: Fatal Error: Fatal: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 1, column: 1): Invalid document structure
Xalan fatal error (publicId: , systemId: , line: 1, column: 1): Invalid document structure


Also, the data posted above is correct. Address1 does not have an ending tag, but I am able to open the file in a browser.

My understanding is that when an XML file can be opened in a browser, it can be considered well-formed XML and can be processed by any XML parser, including the DataStage XML parser.

I will try fixing the file by adding closing tags and see how it goes.

<?xml version="1.0" encoding="utf-16"?>
<data>
<Customer FirstName="abc" Lastname="efg" >
<Address>
<Address1 State="CA" country="USA" >
</Address>
</Customer>
</data>
I feel like Address1 doesn't even have an open tag. If you look at it carefully, "Address1 State="CA" country="USA"" is the data for the Address tags. Customer has a closing tag but no opening tag. Please fix it and give it a try.