white spaces in xml
Hi,
I am getting the following error in the XML Input stage.
The job design is:
Folder stage ---> XML Input stage ---> Sequential File stage
[b]XML input document parsing failed. Reason: Xalan error (publicId: ,
systemId: , line: 826, column: 22): Datatype error:
Type:InvalidDatatypeValueException, Message:Value '8' does not match
any member types (of the union)[/b]
Line 826 is the <E19_04> 8</E19_04> node, which is causing the problem.
The schema details for this node are as follows:
...................................................................
<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema" minOccurs="0" name="E19_04" type="SizeOfProcedureEquipment">
<xs:annotation>
<xs:documentation>The size of the equipment used in the procedure on the patient</xs:documentation>
</xs:annotation>
</xs:element>
<xs:simpleType name="SizeOfProcedureEquipment">
<xs:annotation>
<xs:documentation>The size of the equipment used in the procedure on the patient</xs:documentation>
</xs:annotation>
<xs:union memberTypes="NullValues">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="2"/>
<xs:maxLength value="20"/>
</xs:restriction>
</xs:simpleType>
</xs:union>
</xs:simpleType>
........................................
My guess is that for <E19_04> 8</E19_04> the DataStage XML Input stage is removing the space before the digit '8', keeping only '8', which is 1 character, while the definition (<xs:minLength value="2"/>, <xs:maxLength value="20"/>) requires at least 2 characters.
How can I take the data as it is, that is, preserve the whitespace in the XML Input stage?
Please suggest.
Thanks,
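The arithmetic behind this guess can be sketched in a few lines of Python. This is only an illustration, not what DataStage or the parser actually runs: `collapse` is a hypothetical helper that approximates the XML Schema whiteSpace="collapse" rule, and `length_ok` mirrors the minLength/maxLength facets from the schema above. With whitespace preserved (the default for xs:string) the value ' 8' is 2 characters long and passes; once collapsed it becomes '8' and fails, which would be consistent with the '8' quoted in the error message.

```python
import re


def collapse(value: str) -> str:
    """Approximate XML Schema whiteSpace="collapse" (hypothetical helper).

    Runs of tab, newline, carriage return and space become a single
    space, then leading and trailing spaces are removed.
    """
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")


def length_ok(value: str, min_len: int = 2, max_len: int = 20) -> bool:
    """Mirror the minLength=2 / maxLength=20 facets from the schema."""
    return min_len <= len(value) <= max_len


raw = " 8"  # element content as it appears in the instance document

print(length_ok(raw))            # whitespace preserved: 2 characters
print(length_ok(collapse(raw)))  # whitespace collapsed: 1 character
```

It does not show which of the two treatments the XML Input stage applies; it only shows why the difference would matter for this schema.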
There's more to this than meets the eye. DataStage itself typically isn't doing anything at this point... it's all Xalan, which does the schema validation. Have you run this through any other validation mechanisms? I haven't been out to the site in a while, but I believe you can check validity here: http://www.w3.org/2001/03/webdata/xsv#hlp-warn
Alternatively, and perhaps even better, get a copy of XMLSpy and put your document through validation there.
I'll have to dig more into xs:union. I usually see minInclusive/maxInclusive, or a list of enumerated types.
Ernie
...and, if you go to www.apache.org and find and download Xerces, you will find http://xml.apache.org/xerces-c/stdinparse.html , and can perform command-line validation.
This looks cool. Will have to try it myself, too. I've usually depended on XMLSpy, but have a new laptop and haven't re-installed my old license. This could be better. I'll follow up after I try it.
Bottom line: check your XML instance document using external validation. If it fails validation, then there's either a bug in the parser (unlikely), or the usage case is wrong, or the data is represented incorrectly. If it passes validation, then we have to look more closely at why it fails under DS. It could be that the data gets manipulated or damaged on the way into the parser, or we have too old a parser, or something else.
Ernie
Hi Ray,
How can I preprocess the file? I am getting the file from a vendor. Is there any way I can preprocess it? Please suggest. (I checked it with a validator; it validates OK but fails in DataStage.)
Hi Ernie,
I have downloaded the validator from the vendor's site, and all the files that gave errors validated OK.
I checked the file in the validator and it shows <E19_04> 8</E19_04>,
which means it preserves the whitespace, but if I open the same file with Explorer
it trims the whitespace and shows <E19_04>8</E19_04>.
I changed the XSD (<xs:minLength value="1"/>) as follows:
<xs:simpleType name="SizeOfProcedureEquipment">
<xs:annotation>
<xs:documentation>The size of the equipment used in the procedure on the patient</xs:documentation>
</xs:annotation>
<xs:union memberTypes="NullValues">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="1"/> <!-- was "2" earlier -->
<xs:maxLength value="20"/>
</xs:restriction>
</xs:simpleType>
</xs:union>
</xs:simpleType>
but even then it shows the same error.
Please suggest what the solution could be.
Thanks for your help,
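Since a browser's rendering is not a reliable view of the raw content, one way to see exactly what an XML parser receives is to print the element text with repr(). The sketch below uses Python's standard xml.etree.ElementTree, which is an assumption for illustration and is not the parser used by the XML Input stage; the `record` wrapper element and the `raw_text` helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def raw_text(xml_fragment: str, tag: str) -> str:
    """Return the text of the first element named `tag`, untouched."""
    root = ET.fromstring(xml_fragment)
    node = root if root.tag == tag else root.find(f".//{tag}")
    if node is None:
        raise ValueError(f"no <{tag}> element found")
    return node.text or ""


# Hypothetical one-element sample; the real vendor file is much larger.
sample = "<record><E19_04> 8</E19_04></record>"

value = raw_text(sample, "E19_04")
print(repr(value), len(value))  # repr() makes leading whitespace visible
```

A non-validating parse like this keeps the leading space, so the character is present in the document even when a viewer hides it.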
[quote="ray.wurlod"]You could filter it through a [b]sed[/b] or [b]awk[/b] command (or maybe even [b]tr[/b] depending on your exact requirements). ...[/quote]
Hi All,
I thought the same, but when I contacted the vendor, he said he will resend all the files, replacing ' 8' with '8.0'.
So for the time being the problem is solved, as I am getting whitespace in only one tag.
Thanks a lot for your help,
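For anyone who cannot get corrected files from the vendor, the same substitution could be done as a preprocessing step before the Folder stage reads the file. The thread suggests sed, awk or tr; the sketch below is a rough Python equivalent, not a tested DataStage solution. The function name `fix_e19_04` is hypothetical, and extending the rule from ' 8' to any whole number with leading whitespace is an assumption.

```python
import re

# Matches <E19_04>, leading whitespace, a whole number, then </E19_04>.
# Only the E19_04 tag is touched, since that is the only tag reported
# to carry leading whitespace.
PATTERN = re.compile(r"(<E19_04>)\s+(\d+)(</E19_04>)")


def fix_e19_04(xml_text: str) -> str:
    """Rewrite e.g. '<E19_04> 8</E19_04>' as '<E19_04>8.0</E19_04>'."""
    return PATTERN.sub(r"\g<1>\g<2>.0\g<3>", xml_text)


line = "<E19_04> 8</E19_04>"
print(fix_e19_04(line))
```

The result '8.0' is 3 characters long, so it satisfies the original minLength of 2 without depending on how whitespace is handled. Values with no leading whitespace are left unchanged.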