Xml Validation in XML Input Stage against a given schema


Moderators: chulett, rschirm, roy

harshada
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Xml Validation in XML Input Stage against a given schema

Post by harshada »

I have an XML input file to be parsed into a flat file/DBMS.
I need to validate the input XML file against an XSD schema file.

XML Input Stage Details
---------------
Include namespace declarations - Checked
Validate Schema - Strict
Namespace declaration:
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:defns="http://tempuri.org/sdnList.xsd"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"


It does basic checks such as well-formedness: for example, if a closing tag is missing, an error is thrown.
My question is: does it validate against the schema file provided, i.e. check the data types, 'maxOccurs', 'minOccurs', etc.?
Example :
Schema file : <xs:element name="uid" type="xs:int" />
XML File : <uid>737</uid>

If I change the value of uid to <uid>abc </uid>, the stage does not flag it as invalid or throw an error.

Can anyone please let me know whether such a case can be handled in a DataStage parallel or server job, and if so, how?

Thanks
harshada
VCInDSX
Premium Member
Posts: 223
Joined: Fri Apr 13, 2007 10:02 am
Location: US

Re: Xml Validation in XML Input Stage against a given schema

Post by VCInDSX »

Hi harshada,
Did you try adding the schemaLocation to the header?
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:defns="http://tempuri.org/sdnList.xsd"

xsi:schemaLocation="Yourschema.xsd"

xmlns:xsd="http://www.w3.org/2001/XMLSchema"


Good luck,
-V
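A note on the attribute format, offered as a sketch rather than a confirmed fix for the XML Input stage: in XML Schema, xsi:schemaLocation holds whitespace-separated pairs of namespace URI and schema location, while a bare file name on its own belongs in xsi:noNamespaceSchemaLocation. The Python below only checks that pairing with the standard library. The helper name schema_location_pairs and the GOOD/BAD sample strings are made up for illustration, and whether the stage honours this hint at all is an assumption that would need testing.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def schema_location_pairs(xml_text):
    """Return the (namespace, location) pairs declared on the root element."""
    root = ET.fromstring(xml_text)
    # Namespaced attributes are looked up as "{namespace-uri}localname".
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    if len(tokens) % 2:
        raise ValueError(f"schemaLocation needs namespace/location pairs, got {tokens}")
    return list(zip(tokens[0::2], tokens[1::2]))

# Hypothetical root element using the namespace from this thread.
GOOD = (
    '<sdnList xmlns="http://tempuri.org/sdnList.xsd" '
    f'xmlns:xsi="{XSI}" '
    'xsi:schemaLocation="http://tempuri.org/sdnList.xsd Yourschema.xsd"/>'
)
# Same element, but with only the file name, as suggested in the post above.
BAD = GOOD.replace("http://tempuri.org/sdnList.xsd Yourschema.xsd", "Yourschema.xsd")

if __name__ == "__main__":
    print(schema_location_pairs(GOOD))
    try:
        schema_location_pairs(BAD)
    except ValueError as exc:
        print("rejected:", exc)
```

If the paired form makes no difference either, that would suggest the stage takes its schema from somewhere other than this attribute.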
harshada
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Post by harshada »

Hi,
Yes, I tried xsi:schemaLocation="Yourschema.xsd" in the namespace declaration. The validation still does not happen. Could you please tell me what kind of validations are performed when Validate Schema is checked? For example, if a new tag that is not present in the XSD schema is added to the input XML file, the job should abort or throw a warning. This is not happening.

Thanks ..
Harshada
VCInDSX
Premium Member
Posts: 223
Joined: Fri Apr 13, 2007 10:02 am
Location: US

Post by VCInDSX »

Also, in the "Validation Settings" tab, make sure you enable "Log Reject errors",
and in the "Validation Error Mappings", select "Reject" for the "Fatal" and "Error" types.

This sends the error messages to the log.

Please download and refer to the "XML Best Practices document" that Duke has on his website http://www.duke-consulting.com/Download ... ctices.zip. Thanks again, Duke, for the good work.

HTH,
-V
harshada
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Post by harshada »

Thanks, V, for the help.
1) In the "Validation Settings" tab, 'Log reject errors' is checked.
2) We tried all possible combinations of the Fatal, Error and Warning settings; none worked. We have defined a reject link from the XML Input Stage, but none of the expected rejects are captured there.
3) We went through the XML Best Practices document, imported some of the *.dsx files and checked the jobs. But in those sample jobs no 'schemaLocation' is mentioned, nor is 'Validate input XML' checked on the General tab.

Thanks

Harshada
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I've never had a need to use that option. Open a case with your official Support provider, let us know what they say.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

I have used it many times, and in fact have tests that check just about every possible schema consideration --- datatypes, min occurs/max occurs, enumerated values, etc. It's in an old .dsx so let me load it up and check, but I used it not that long ago and will send the job and some samples out once I find them....

One interesting observation --- I know that the tests I have are in Server. I don't expect it should matter, but just for kicks, try your schema and xml document validation test in Server in the meantime and see if there's an issue there.

Ernie
harshada
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Post by harshada »

Hi,
Thanks a lot, Ernie, that would be really helpful. I will certainly try the server job. In the meantime I would like to ask something else too. My sample XML file is as follows:
<?xml version="1.0" standalone="yes"?>
<sdnList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://tempuri.org/sdnList.xsd">
  <publshInformation>
    <Publish_Date>05/17/2007</Publish_Date>
    <Record_Count>3453</Record_Count>
  </publshInformation>
  <sdnEntry>
    <uid>aaa</uid>
    <lastName>EMPRESA CUBANA DE AVIACION</lastName>
    <sdnType>Entity</sdnType>
    <programList>
      <program>CUBA</program>
    </programList>
    <akaList>
      <aka>
        <uid>aaa</uid>
        <type>a.k.a.</type>
        <category>strong</category>
        <lastName>CUBANA AIRLINES</lastName>
      </aka>
    </akaList>
    <addressList>
      <address>
        <uid>aaa</uid>
        <uid>aaa</uid>
        <address1>Belas Airport</address1>
        <city>Luanda</city>
        <country>Angola</country>
      </address>
    </addressList>
  </sdnEntry>
</sdnList>

When I run the job to read this XML file, even with Validate Schema unchecked and all three transformation error mappings set to Fatal, no records are read (0 records read) and there is no error in the log.
I am reading the file from a URL/file path specified.

If the second line of the input sample file is changed to just '<sdnList>', all records are read successfully. Is there anything wrong with the input file (although I am sure this file is correct)? Can you point out something here?

Following are the options checked in the XML input PX stage:

XML source --> Column Content --> URL/File path
Validate input XML --> not checked
Include namespace declarations --> checked and loaded
Transformation Error Mappings --> all 3 Fatal

Thanks again.
Harshada
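One possible explanation for the 0 records, sketched here as an assumption rather than a confirmed diagnosis of the XML Input stage: with xmlns="http://tempuri.org/sdnList.xsd" on the root, every element belongs to that namespace, so a path written with plain names such as sdnEntry no longer matches anything. The Python standard library shows the same effect. The helper count_entries and the two shortened sample documents are hypothetical; the defns prefix is the one from the stage's namespace declarations earlier in the thread.

```python
import xml.etree.ElementTree as ET

NS = "http://tempuri.org/sdnList.xsd"

# Shortened stand-ins for the sample file, with and without the default namespace.
WITH_DEFAULT_NS = f'<sdnList xmlns="{NS}"><sdnEntry><uid>737</uid></sdnEntry></sdnList>'
WITHOUT_NS = "<sdnList><sdnEntry><uid>737</uid></sdnEntry></sdnList>"

def count_entries(xml_text, path, namespaces=None):
    """Count the child elements of the root that match the given path."""
    root = ET.fromstring(xml_text)
    return len(root.findall(path, namespaces))

if __name__ == "__main__":
    # Plain name against the un-namespaced document: matches.
    print(count_entries(WITHOUT_NS, "sdnEntry"))
    # Plain name against the namespaced document: no match, and no error either.
    print(count_entries(WITH_DEFAULT_NS, "sdnEntry"))
    # Prefixed name bound to the same namespace URI: matches again.
    print(count_entries(WITH_DEFAULT_NS, "defns:sdnEntry", {"defns": NS}))
```

If the stage behaves the same way, the column XPath descriptions would need the defns: prefix on each step (for example /defns:sdnList/defns:sdnEntry/defns:uid) when the input carries the default namespace; that is worth checking against the imported table definition.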