I'm having a problem getting the XML Input stage to work. My input source is a sequential file (the actual XML doc), read line by line as a variable-length string, i.e. with the delimiter set to "000". Each line is sent to the XML Input stage with 'Column content' set to "XML document". The output columns for the XML Input stage are loaded from a table definition I created with the XML Meta Data Importer from my XML doc's DTD.
The problem is with the input: I don't know how to get the XML Input stage to understand the lines sent from the sequential file source, so the XML parsing fails.
All warning messages refer to line 1, e.g. "Equity_Index..XML_Input_22: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 1, column: 23): Invalid document structure"
"Equity_Index..XML_Input_22: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 1, column: 98): Invalid document structure"
etc..
Can anyone help me with this?
Thanks
XML Input stage
Re: XML Input stage
Not the expert on XMLInput but if I remember correctly you don't read an XML input file with a Sequential Stage :D . You use a FolderStage to point to the XML file and connect that to XMLInput.
"There is one crucial design requirement of XML Input - you need to pass it an input link contain a URL or an file path or an XML document".
If you installed a 7.5 client you should have the documentation on the XML stages on your PC.
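To see why every warning points at line 1, here is a minimal sketch outside DataStage, using Python's standard-library XML parser rather than the Xalan/Xerces parser the stage uses. The sample document and its element names are made up for illustration. The point is only that a parser handed one line at a time treats each line as a complete document of its own, and most single lines are not well-formed on their own.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are invented for this sketch.
DOC = """<?xml version="1.0"?>
<indices>
  <index name="A">
    <value>100.5</value>
  </index>
</indices>
"""

def parses(text):
    """Return True if text is a well-formed XML document on its own."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The whole file passed as one value: this is what the parser expects.
whole_ok = parses(DOC)

# One line per row, the way a line-by-line sequential read delivers it.
line_results = [parses(line) for line in DOC.splitlines() if line.strip()]

print("whole document parses:", whole_ok)
print("per-line results:", line_results)
```

A line that happens to hold a complete element can parse by itself, but it is then a separate tiny document, not part of the original structure. Passing the whole file as one value (or a path to it) avoids the problem.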
Ogmios
In theory there's no difference between theory and practice. In practice there is.
If you have a lot of XML data to parse and are breaking it down into several streams, you might want to consider going outside of DataStage to process the XML into one or more flat files and deal with them that way.
I changed a job stream that ran for over 4 hours, processing 45 streams out of 3 largish (few hundred MB) XML files, into a 20-minute stream by using an external XSLT parser. The flat files were then processed.
My understanding is that the XML add-on for DS is really for trickle-fed real-time data, from MQSeries etc., not for humongous bulk files coming through.
Consider looking beyond the sand pit for solutions; sometimes you'll be surprised.
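As a rough illustration of the "flatten it outside DataStage" idea, here is a minimal sketch in Python. It is not the XSLT approach described above: it swaps in the standard library's streaming `iterparse` to get the same XML-to-flat-file result. The function name `xml_to_csv`, the record tag and the field names are all hypothetical and would need to match your own document.

```python
import csv
import io
import xml.etree.ElementTree as ET

def xml_to_csv(xml_source, csv_out, record_tag, fields):
    """Stream records out of xml_source and write one CSV row per record.

    record_tag and fields are assumptions about the document layout:
    each record element is expected to have one child element per field.
    """
    writer = csv.writer(csv_out)
    writer.writerow(fields)  # header row
    count = 0
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            # Missing child elements become empty strings.
            writer.writerow([elem.findtext(f, default="") for f in fields])
            count += 1
            elem.clear()  # drop the record's children once written
    return count

# Tiny in-memory stand-in for a large XML file (invented data).
sample = io.BytesIO(b"""<indices>
  <index><name>A</name><value>100.5</value></index>
  <index><name>B</name><value>98.2</value></index>
</indices>""")

out = io.StringIO()
rows_written = xml_to_csv(sample, out, "index", ["name", "value"])
print(out.getvalue())
```

With real files you would pass an open binary file for `xml_source` and a text file opened with `newline=""` for `csv_out`, then read the resulting CSV with an ordinary Sequential File stage.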
Andrew
Think outside the Datastage you work in.
There is no True Way, but there are true ways.
I've used two parsers: Saxon and Xalan/Xerces. Xalan/Xerces is on apache.org; Saxon you'll have to search for.
I think Xalan and Xerces came originally from IBM and were open-sourced to Apache control a few years ago.
There are others available, but I like to use open-source software.
You'll need to read up on XSLT to create the scripts. I might be able to find a copy of the XSLT I used to create CSV files, but I don't have it on hand at the moment.
Andrew