Reject in XML input stage

Posted: Fri Aug 29, 2008 9:02 am
by goutam
Hi all,

I am reading an XML file through the XML Input stage, passing the path of the file in through an External Source stage. My requirement is to reject the records that cannot be read because of wrong metadata.

For this I took a second output link from the XML Input stage and marked it as a reject link in the output properties of the stage. I also kept the same columns on the reject link as on the input link of the XML Input stage. To force a reject, I deliberately introduced a metadata error, such as changing a datatype from varchar to integer.

But when I ran the job, the records were not rejected; instead, null values were populated in the column with the wrong datatype. Is there any other way to do this?

Posted: Fri Aug 29, 2008 9:52 am
by throbinson
You need to define a reject link for your reject link.

Just kidding!

In order for this to work as a reject within the XML Input Stage, you'll need to Validate the Input XML via schema validation. This means you'll need an XSD. Do you have one? The documentation is very good: xmLPACK_20_Designer.pdf page 31.

Posted: Fri Aug 29, 2008 10:30 am
by goutam
Yes, I have checked the validate schema check box in the XML Input stage, but the XPath expression in the stage was entered manually. Is this the reason why I am not getting records on the reject link?

Posted: Fri Aug 29, 2008 10:37 am
by throbinson
You can enter the XPaths manually. If you have the XSD defined correctly, then any typos you make in the XPaths will most likely throw a reject. Have you correctly defined the XSD that will be used to validate the XML?

Posted: Fri Aug 29, 2008 10:48 pm
by goutam
Yes. The XPath expression is correct and the job runs fine. But the record that doesn't match the metadata is not going to the reject link; instead it is passed to the stream link (main output) with a null value.

Posted: Sat Aug 30, 2008 6:24 am
by eostic
There was an anomaly in 8.0 where validation failures were not caught, and I believe there is a fix, but I don't know the patch number. Report it. I'm 90% certain it has been corrected.

Ernie

Posted: Sat Aug 30, 2008 9:19 am
by throbinson
I didn't ask if the XPath was correct. I asked if the XSD was defined correctly. That is, within the XML as an element with an attribute of schemaLocation. However, since this is 8.0 there may be other ways to validate. The functionality may have changed. I am in 7.5.3 where the XML must contain the XSD in order to validate the XML.

Posted: Sat Aug 30, 2008 8:33 pm
by eostic
As far as I know, throbinson is correct: this is still the same method in v8, and the XSD must be indicated in the XML instance. So, make sure the syntax fits what throbinson has stated, and then check with support about the validation anomaly.

Ernie

Posted: Tue Sep 02, 2008 9:33 am
by goutam
I have the following XML document.

<?xml version="1.0" encoding="UTF-8" ?>
<RECORD SchemaLocation="copysheet.xsd">
<DGTL_ASSET_ID>200822010145799020</DGTL_ASSET_ID>
<PKGFMT>LABEL</PKGFMT>
<CPYDOC>P685892</CPYDOC>
<CEDTR>GIBSON VENITA TECH.</CEDTR>
<COPYSHEET_TYPE>SY</COPYSHEET_TYPE>
<DGTL_ASSET_ID>20000000000001</DGTL_ASSET_ID>
<PKGFMT>LABEL</PKGFMT>
<CPYDOC>P685884</CPYDOC>
<CEDTR>GIBSON VENITA TECH.</CEDTR>
<COPYSHEET_TYPE>SY</COPYSHEET_TYPE>
</RECORD>



The XML schema is copysheet.xsd and is defined as

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema>
<xs:element name="RECORD">
<xs:complexType>
<xs:sequence>
<xs:element name="DGTL_ASSET_ID" type="xs:decimal"/>
<xs:element name="PKGFMT" type="xs:string"/>
<xs:element name="CPYDOC" type="xs:string"/>
<xs:element name="CEDTR" type="xs:string"/>
<xs:element name="COPYSHEET_TYPE" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

First I ran the job with the metadata as defined in the XSD. The job ran fine.

When I changed the datatype of DGTL_ASSET_ID to date, the job still ran fine, with the warning "XML_Input_3,0: Warning: Input_XML_Test.XML_Input_3: XSLT Processor: Column Name = "DGTL_ASSET_ID": Value = "200822010145799020" is an invalid date."

Instead, the record should go to the reject link.

Please let me know if there is anything wrong.
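
Editorial sketch: as posted, the schema's root element uses the xs: prefix without binding it to a namespace, which a namespace-aware parser would refuse to load. The declaration may simply have been lost when the post was pasted. Below is the same schema with the standard W3C declaration added, assuming no target namespace is wanted. Nothing else is changed.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- xs prefix bound to the W3C XML Schema namespace; no targetNamespace (assumption) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="RECORD">
    <xs:complexType>
      <!-- minOccurs and maxOccurs default to 1: each child exactly once, in this order -->
      <xs:sequence>
        <xs:element name="DGTL_ASSET_ID" type="xs:decimal"/>
        <xs:element name="PKGFMT" type="xs:string"/>
        <xs:element name="CPYDOC" type="xs:string"/>
        <xs:element name="CEDTR" type="xs:string"/>
        <xs:element name="COPYSHEET_TYPE" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sequence, a RECORD that holds the five elements twice, as the sample document does, would not validate. If repeated groups are intended, one option is maxOccurs="unbounded" on xs:sequence; whether that matches the real feed is an assumption.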

Posted: Tue Sep 02, 2008 9:41 am
by chulett
You should provide a full pathname to the XSD. And as far as I know it should be "schemaLocation"; the case matters as well.
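
Editorial sketch: in standard W3C XML Schema usage the schema hint is a case-sensitive, namespace-qualified attribute on the root element. The example assumes copysheet.xsd declares no target namespace (the schema posted above does not), in which case the attribute is xsi:noNamespaceSchemaLocation; xsi:schemaLocation takes namespace/location pairs instead. Whether the XML Input stage requires exactly this form is not confirmed anywhere in this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- xsi prefix bound to the XML Schema instance namespace -->
<RECORD xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="copysheet.xsd">
  <DGTL_ASSET_ID>200822010145799020</DGTL_ASSET_ID>
  <PKGFMT>LABEL</PKGFMT>
  <CPYDOC>P685892</CPYDOC>
  <CEDTR>GIBSON VENITA TECH.</CEDTR>
  <COPYSHEET_TYPE>SY</COPYSHEET_TYPE>
</RECORD>
```

Only one group of the five elements is shown, so that the instance matches the sequence declared in the posted schema.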

Posted: Tue Sep 02, 2008 10:06 am
by throbinson
You're almost golden! The XSD is working, since it is correctly identifying the element as a date. You need to change the mapping of that message from Warning to Reject and it will work. You don't have to fully qualify the path unless you want to. You probably should, but the XML Input stage IS finding it where you've got it. The Project subdirectory? Not a Best Practice. The XSD should be put somewhere else, with the location either hard-coded or parameterized.

Posted: Tue Sep 02, 2008 10:10 am
by chulett
Exactly, it would need to be in the Project folder (and moved with the job) for that relative path to work. As noted, not a Best Practice.

Posted: Tue Sep 02, 2008 11:10 am
by goutam
throbinson,

I mapped Warning to Reject on the Transformation page of the XML Input stage. Still, records are not going to the reject link. Both the XML document and the XML schema are in the same location, and it is not a project subdirectory. Do I need to give the full path of the XSD in the XML document?

Posted: Tue Sep 02, 2008 11:36 am
by throbinson
No. I'm pretty sure it is finding the XSD. I believe the validation of the XML is taking place. It is the reject that is not happening. This now goes back to Ernie's comment about a patch. If you have mapped the warning/fatal/informational DS log message to Reject, then it should have rejected on an improper DATE. I'm out of ideas. :cry:

Posted: Tue Sep 02, 2008 12:02 pm
by chulett
I think you are right about the 'finding it' part. I went back through an old job that uses that feature and while it is an XML Output stage rather than an input, the annotation says it needs to be in the same directory as the xml for the 'relative' path in the schemaLocation to find it.
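
Editorial sketch: the two options discussed in this thread differ only in the attribute value. The fragments below are two alternative root elements, not one document. They again assume the standard xsi attribute and a schema without a target namespace, and the directory /data/xml/schemas is hypothetical.

```xml
<!-- Option 1: relative path; per the annotation above, found when the XSD sits next to the XML file -->
<RECORD xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="copysheet.xsd">
  <!-- child elements omitted -->
</RECORD>

<!-- Option 2: absolute location (hypothetical directory), independent of where the XML file lives -->
<RECORD xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="file:///data/xml/schemas/copysheet.xsd">
  <!-- child elements omitted -->
</RECORD>
```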