XML Input Stage

Moderators: chulett, rschirm, roy

harshada
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Post by harshada »

job : <seq file> -->> <XML Input> -->> <ora table>

I'm facing problems parsing an XML document. Below is a sample XML document:
<sdnEntry>
<uid>737</uid>
<lastName>EMPRESA CUBANA DE AVIACION</lastName>
<sdnType>Entity</sdnType>
<programList>
<program>CUBA</program>
</programList>
</sdnEntry>

The error I get is "Xalan fatal error (publicId: , systemId: , line: 1, column: 1): Invalid document structure"
and zero rows are output.

But if I remove the newline characters from the XML document, like this:
<sdnEntry><uid>737</uid><lastName>EMPRESA CUBANA DE AVIACION</lastName><sdnType>Entity</sdnType><programList><program>CUBA</program></programList></sdnEntry>

then the document is parsed successfully and I get the expected rows in the output.
I suspect it has to do with some settings on the Format tab of the Sequential File stage. Any help in this regard would be appreciated.
Also, can anyone tell me how to use a schema file to validate the input XML document?

many thanks
harshada
VCInDSX
Premium Member
Posts: 223
Joined: Fri Apr 13, 2007 10:02 am
Location: US

Post by VCInDSX »

I am not sure about issue #1; you might want to experiment with the "Format" settings of the Sequential File stage.
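One possible explanation for issue #1, which is an assumption and not something confirmed in this thread: if the file stage treats each newline as a record delimiter, every line of the document reaches the XML Input stage as its own row, and most of those rows are not well-formed XML on their own. The sketch below mimics that with Python's standard library parser (not Xalan, and not DataStage itself); `rows_by_newline` and `is_well_formed` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# The sample document from the question, with its newlines intact.
DOC = """<sdnEntry>
<uid>737</uid>
<lastName>EMPRESA CUBANA DE AVIACION</lastName>
<sdnType>Entity</sdnType>
<programList>
<program>CUBA</program>
</programList>
</sdnEntry>
"""


def is_well_formed(text):
    """Return True if text parses as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def rows_by_newline(text):
    """Mimic a reader that emits one row per newline-delimited record (assumed behaviour)."""
    return text.splitlines()


# The whole document, newlines included, is well-formed.
whole_ok = is_well_formed(DOC)

# Fed one line at a time, lines such as "<sdnEntry>" or "</programList>" fail to parse.
line_results = [is_well_formed(row) for row in rows_by_newline(DOC)]

# Joining the lines reproduces the single-line variant that worked in the question.
single_line_ok = is_well_formed("".join(rows_by_newline(DOC)))
```

If this is what is happening, the fix would be on the file stage side (reading the whole file as one record) rather than in the XML itself.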

As for validating the input document against an XSD, here is what you should do.
1. Enable the "Validate input XML" checkbox in the "General" page of the "Stage" properties.
2. Set the Schema Validation Level to "Strict".
3. If you have too many documents of this type coming in, enable the caching as well.
4. In the "Transformation Settings" page, enable the "namespace declaration" checkbox and "Load" the imported XML table definition. This will bring in the namespaces you have in the XSD. It should look like the following:

xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns="http://myschema.defaults.com/sdnSchemas/2007-22-05/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
5. Add the schemaLocation property to this list. The section will then look like this:

xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns="http://myschema.defaults.com/sdnSchemas/2007-22-05/"

xsi:schemaLocation="http://myschema.defaults.com/sdnSchemas/2007-22-05/
/sdnData/schemas/sdnSchema.xsd"

xmlns:xsd="http://www.w3.org/2001/XMLSchema"
6. You might want to try the different settings of "Transformation Error Mappings" on the "General" page if you would like to convert XML stage errors to DataStage errors.
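A note on the value added in step 5: xsi:schemaLocation is a whitespace-separated list of pairs, each a namespace URI followed by the location of the schema for that namespace, so the line break between the two parts above is harmless. The sketch below reads the pairs back with Python's standard library. It does not perform XSD validation, and placing the attributes on a sample root element is an assumption made only for illustration; `schema_locations` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
NS = "http://myschema.defaults.com/sdnSchemas/2007-22-05/"

# Hypothetical instance document carrying the declarations from step 5.
DOC = f"""<sdnEntry xmlns="{NS}"
    xmlns:xsi="{XSI}"
    xsi:schemaLocation="{NS}
    /sdnData/schemas/sdnSchema.xsd">
  <uid>737</uid>
</sdnEntry>"""


def schema_locations(xml_text):
    """Return {namespace URI: schema location} from the root's xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    raw = root.get(f"{{{XSI}}}schemaLocation", "")
    tokens = raw.split()  # any run of whitespace, including newlines, separates tokens
    if len(tokens) % 2:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))


pairs = schema_locations(DOC)
```

The namespace URI in the pair has to match the default namespace (xmlns=...) exactly, otherwise a validator has no schema to apply to the unprefixed elements.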

Good luck,
-V
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Perhaps issue #1 does have something to do with the Sequential File stage. As a test, at least, try sending the full path (URL) of the file to the XML Input stage instead of its content. You might even like this solution, as it lets you be very creative about how you get the file name(s) of your XML docs, and it also frees you from some size limitations that apply when the XML content is passed in a single column directly from another stage. At the very least it will help you determine whether the problem is the stage, the file, or something else.

Ernie
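As a rough illustration of the "pass the path, not the content" idea, the sketch below writes the absolute path of each XML document to a list file, one path per line, which an upstream stage could then read and hand to the XML Input stage. The one-path-per-row layout and the name `write_path_list` are assumptions for this example, not something DataStage prescribes.

```python
import tempfile
from pathlib import Path


def write_path_list(xml_dir, list_file):
    """Write the absolute path of every *.xml file in xml_dir, one per line."""
    paths = sorted(str(p.resolve()) for p in Path(xml_dir).glob("*.xml"))
    Path(list_file).write_text("".join(p + "\n" for p in paths), encoding="utf-8")
    return paths


def demo():
    """Create two throwaway XML files and list them."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "sdn_1.xml").write_text("<sdnEntry/>", encoding="utf-8")
        (Path(tmp) / "sdn_2.xml").write_text("<sdnEntry/>", encoding="utf-8")
        return write_path_list(tmp, Path(tmp) / "xml_files.lst")


if __name__ == "__main__":
    for path in demo():
        print(path)
```

Because each row carries only a short path, the newlines inside the documents never pass through the file stage at all.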