XML stage does not process the first row

Posted: Fri Mar 08, 2013 8:17 am
by dspr
Hi, I am using the XML stage as a target to write a simple XML file. My issue is that the XML stage never writes the first row of the input to the XML file.

If I add a file target as a second downstream link, all rows are written to it correctly. Below are the input file, the XSD, and the XML output.

Input:-

First,Last,Title===>Header
a,s,d========>This row is skipped always
s,d,f
f,g,h

XSD:-

	<xs:element name="Company">
		<xs:complexType>
			<xs:sequence>
				<xs:element name="Address" maxOccurs="unbounded">
					<xs:complexType>
						<xs:sequence>
							<xs:element name="First" type="xs:string"/>
							<xs:element name="Last" type="xs:string"/>
							<xs:element name="Title" type="xs:string"/>
						</xs:sequence>
					</xs:complexType>
				</xs:element>
			</xs:sequence>
		</xs:complexType>
	</xs:element>
</xs:schema>

XML output generated using the XML stage:-

<Company xmlns="http://my-company.com/namespace">
	<Address>
		<First>s</First>
		<Last>d</Last>
		<Title>f</Title>
	</Address>
	<Address>
		<First>f</First>
		<Last>g</Last>
		<Title>h</Title>
	</Address>
</Company>
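
To confirm exactly which input row is missing, the input file and the generated XML can be compared outside DataStage. The following is a minimal Python sketch, not part of the job itself: the function name `missing_rows` is hypothetical, and the sample data is copied from the input and output shown above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace taken from the posted XML output.
NS = {"c": "http://my-company.com/namespace"}

# Sample data copied from the post: header plus three data rows.
CSV_TEXT = "First,Last,Title\na,s,d\ns,d,f\nf,g,h\n"

XML_TEXT = """<Company xmlns="http://my-company.com/namespace">
  <Address><First>s</First><Last>d</Last><Title>f</Title></Address>
  <Address><First>f</First><Last>g</Last><Title>h</Title></Address>
</Company>"""


def missing_rows(csv_text, xml_text, has_header=True):
    """Return the input rows that have no matching <Address> element."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if has_header:
        rows = rows[1:]  # drop the column-name row
    root = ET.fromstring(xml_text)
    written = [
        [addr.findtext(f"c:{name}", namespaces=NS)
         for name in ("First", "Last", "Title")]
        for addr in root.findall("c:Address", NS)
    ]
    return [row for row in rows if row not in written]


print(missing_rows(CSV_TEXT, XML_TEXT))  # [['a', 's', 'd']]
```

With the data from the post, the only row reported as missing is `a,s,d`, the first data row.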

Posted: Mon Mar 11, 2013 2:29 am
by dspr
Dear XML experts, any comments on the issue above?

Posted: Mon Mar 11, 2013 3:14 am
by prasson_ibm
Hi,

Are you getting any warnings in the log? What is your source stage?

Posted: Mon Mar 11, 2013 3:17 am
by prasannakumarkk
Also post the XPath used for each column.

Posted: Mon Mar 11, 2013 6:30 am
by eostic
and note which column is checked as the "key" (repetition element).

Ernie

Posted: Mon Mar 11, 2013 7:58 am
by dspr
Hi All,

I am using a simple sequential file as the source.
I am not getting any warnings in the log, and the job succeeds. I am using the new XML stage as the target, so it does not show me the XPath anywhere.
I also can't see any option for selecting repetition elements or a key column in the new XML stage.
I just map the columns in the Assembly Editor and that is it. In the Assembly Editor I only have the Input step, the Composer step, and the Output step.
If I pass the header along downstream, then the XML stage skips the header and writes the records starting from the first data row.


Please help.

Posted: Mon Mar 11, 2013 11:53 am
by eostic
You say "header"..... but do you really mean "the first row"? Often a header means a row that is in some different type of format.

If we assume that you simply have three TOTAL rows, then perhaps your schema has other validating criteria. Are the datatypes correct? Are there any enumerations?

Don't worry about the xpath and the repetition element. You are using a different Stage.

Ernie

Posted: Mon Mar 11, 2013 1:34 pm
by ray.wurlod
Do you have "first line is column names" set in the Sequential File stage?

Posted: Tue Mar 12, 2013 2:36 am
by dspr
Hi Ernie and All,

Yes, you are correct. The stage always skips the first row. I had set "first row is header" in the source Sequential File stage, so the header was not being passed downstream anyway.
I then unchecked that option in the source Sequential File stage. In this case the header goes downstream together with the data rows, through the Transformer to the XML stage, and the XML stage now skips the header and processes the records from the first data row. So the conclusion is:-
The XML stage skips the first row it receives, whether that is the header or an actual data row. If I want no data to be skipped, the only workaround I have is to pass the header along.
Please also refer to the XSD in my first post above. It has no restrictions that I can see, and all data types are string.
I am not sure why the XML stage keeps skipping the first row. I have tried this with different XSDs and the behaviour is the same for all of them.
Please help.

Posted: Tue Mar 12, 2013 5:20 am
by eostic
At the moment, I can't imagine. Never seen it do that. ...unless there is other logic in your Assembly, like a pivot or an aggregator or a switch, etc. that is causing it to drop. Put a transformer in front of the xml and send the rows to another sequential file. Be sure your first row (let's stop calling it a header) is in that file.

Ernie

Posted: Tue Mar 12, 2013 7:50 am
by dspr
I used a Transformer with a file target alongside the XML stage. All rows are written to the file target, but the XML stage again writes one row fewer (total rows - 1). So the first row is skipped in all cases, with all of my XSDs. The first row passes through the Transformer and is written correctly on the output link to the file target, but it does not appear on the output link from the Transformer to the XML stage.
The next steps I will try are:-
1. Use a database as the source with the XML stage as the target, and check again
2. Try on another server
3. Try with more XSDs

Please let me know if you can think of any other solution.

Posted: Tue Mar 12, 2013 8:10 am
by eostic
Some other thoughts...

a) make certain that you do NOT have validation selected in the Composer step
b) do tests with 100% character string data (varchar, char on the links), and find (for testing) an xsd that uses only xs:string elements.
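
For point (b), one quick way to confirm that a test XSD declares only xs:string elements is to scan it outside DataStage. This is a rough Python sketch under stated assumptions: the function name `non_string_elements` is hypothetical, the schema is assumed to use the `xs` prefix for its type names, and the sample schema (including the `Age` element) is invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Clark-notation prefix for the XML Schema namespace.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical sample schema, for illustration only.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="First" type="xs:string"/>
        <xs:element name="Age" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def non_string_elements(xsd_text):
    """List (name, type) for elements whose declared type is not xs:string.

    Elements without a type attribute (anonymous complex types) are skipped.
    Assumes type names are written with the 'xs' prefix.
    """
    root = ET.fromstring(xsd_text)
    return [
        (el.get("name"), el.get("type"))
        for el in root.iter(XS + "element")
        if el.get("type") not in (None, "xs:string")
    ]


print(non_string_elements(SAMPLE_XSD))  # [('Age', 'xs:int')]
```

An empty list means every typed element in the schema is declared as xs:string.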

Also...what is the sequence of Steps in your Assembly? Are you using a variety of reGroups to aggregate upper level node information?

Ernie

Posted: Tue Mar 12, 2013 3:12 pm
by ray.wurlod
Is the file missing a line terminator at the end of the header row?
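
A possible way to check this outside DataStage is to read the file as raw bytes and report how each line ends. The sketch below is illustrative only: `line_endings` is a hypothetical helper, and the sample bytes are made up to show mixed terminators, not taken from the actual file.

```python
def line_endings(data: bytes):
    """Return (line_number, terminator) for every line in the raw bytes."""
    report = []
    for number, line in enumerate(data.splitlines(keepends=True), start=1):
        if line.endswith(b"\r\n"):
            kind = "CRLF"
        elif line.endswith(b"\n"):
            kind = "LF"
        elif line.endswith(b"\r"):
            kind = "CR"
        else:
            kind = "none"  # only possible on the last line
        report.append((number, kind))
    return report


# Made-up sample: CRLF after the header, LF elsewhere, no final terminator.
SAMPLE = b"First,Last,Title\r\na,s,d\ns,d,f\nf,g,h"
print(line_endings(SAMPLE))
# [(1, 'CRLF'), (2, 'LF'), (3, 'LF'), (4, 'none')]
```

For a real file, pass `open(path, "rb").read()`. A header line whose terminator differs from the rest, or a line count lower than expected, would suggest the header and the first data row are not being separated the way the stage expects.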

Posted: Wed Mar 13, 2013 6:42 am
by eostic
Check that first row. Make sure also that there aren't any other reasons that the row might be dropped. Any nulls or anything in that row? And as noted earlier, for now, make sure that all your data is 100% character, and present, and loading only to xs:string type elements.
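
The checks on the first row can also be done with a short script before the data reaches the job. This is a minimal Python sketch, assuming a comma-delimited file with three columns as in the original post; `suspicious_rows` and the sample text are hypothetical.

```python
import csv
import io


def suspicious_rows(csv_text, expected_columns=3):
    """Return (line_number, reason) for rows that are short, long, or have empty fields."""
    problems = []
    for number, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if len(row) != expected_columns:
            problems.append((number, "wrong field count"))
        elif any(field.strip() == "" for field in row):
            problems.append((number, "empty field"))
    return problems


# Made-up sample: row 2 has an empty field, row 4 is missing a column.
SAMPLE = "First,Last,Title\na,,d\ns,d,f\nf,g\n"
print(suspicious_rows(SAMPLE))
# [(2, 'empty field'), (4, 'wrong field count')]
```

Run against the four lines posted at the top of the thread, it returns an empty list, so any problem with the real first row would have to be something not visible in the posted sample.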

Ernie