XML Input stage
Moderators: chulett, rschirm, roy
XML Input stage
Hello,
I am trying to parse an XML document using
External source--->XML_input---->Sequentialfile
Gave all the properties in the stages and I see no warnings or errors but the output looks different.
Input is <ENTITY>172</ENTITY>
<ACCOUNT>0000000006196</ACCOUNT>
<TRACK>0</TRACK>
<USER_CD>1101</USER_CD>
Output I get is ENTITY:<ENTITY>172</ENTITY>ACCOUNT:<ACCOUNT>0000000006196</ACCOUNT>TRACK:<TRACK>0</TRACK>USER_CD:<USER_CD>1101</USER_CD>
I was expecting ENTITY:172
ACCOUNT:0000000006196
TRACK:0
USER_CD:1101
can someone help me why am I getting the tags also in the output?
Thank you!
I am trying to parse an XML document using
External source--->XML_input---->Sequentialfile
Gave all the properties in the stages and I see no warnings or errors but the output looks different.
Input is <ENTITY>172</ENTITY>
<ACCOUNT>0000000006196</ACCOUNT>
<TRACK>0</TRACK>
<USER_CD>1101</USER_CD>
Output I get is ENTITY:<ENTITY>172</ENTITY>ACCOUNT:<ACCOUNT>0000000006196</ACCOUNT>TRACK:<TRACK>0</TRACK>USER_CD:<USER_CD>1101</USER_CD>
I was expecting ENTITY:172
ACCOUNT:0000000006196
TRACK:0
USER_CD:1101
can someone help me why am I getting the tags also in the output?
Thank you!
Thanks,
Surya
Surya
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
XML input stage:
Under stage->transformation Settings->selected repetition element required
Input->XMLsource->gave filename and selected URL/Filepath
-->columns--->gave filename with varchar (no) an length 255
output->transformation Settings->selected repetition element required
---> columns -gave all the columns except the top one
Now after trying some options the output looks like
ENTITY
<ENTITY>172</ENTITY>
ACCOUNT
<ACCOUNT>0000000006196</ACCOUNT>
TRACK
<TRACK>0</TRACK>
USER_CD
<USER_CD>1101</USER_CD>
which is the column name is good with entity or account or track or user_code and the records are also under the corresponding column but tags are not removed. Not sure what the reason could be
Under stage->transformation Settings->selected repetition element required
Input->XMLsource->gave filename and selected URL/Filepath
-->columns--->gave filename with varchar (no) an length 255
output->transformation Settings->selected repetition element required
---> columns -gave all the columns except the top one
Now after trying some options the output looks like
ENTITY
<ENTITY>172</ENTITY>
ACCOUNT
<ACCOUNT>0000000006196</ACCOUNT>
TRACK
<TRACK>0</TRACK>
USER_CD
<USER_CD>1101</USER_CD>
which is the column name is good with entity or account or track or user_code and the records are also under the corresponding column but tags are not removed. Not sure what the reason could be
Thanks,
Surya
Surya
There is some issue with the main tag which is the ID. Now after using it as ID I got the right output when pulling the XML from file which has its data loaded from a table.
When reading the same CLOB field from the DB directly, it is throwing an error which says "Xalan fatal error" due to Invalid document structure"
The job design is DBstage--->Transformer---->XMLinput----->Sequentialfile
I am not using the external source stage.....as there is no path for the table...
Please suggest,Thank you
When reading the same CLOB field from the DB directly, it is throwing an error which says "Xalan fatal error" due to Invalid document structure"
The job design is DBstage--->Transformer---->XMLinput----->Sequentialfile
I am not using the external source stage.....as there is no path for the table...
Please suggest,Thank you
Thanks,
Surya
Surya
The result you are getting is by design. Sometimes it is useful to retrieve whole tags.
Open your output link and scroll all the way to the right and review the Description property.
The contents are formal "xpath". When it ends in a slash, you get the whole tag. When it ends in /text(), you get the "contents" of the tag.
Ernie
Open your output link and scroll all the way to the right and review the Description property.
The contents are formal "xpath". When it ends in a slash, you get the whole tag. When it ends in /text(), you get the "contents" of the tag.
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
That's where I was heading when I asked you for your xpath expressions. Ernie pre-empted me. But that's OK - the more people answering questions well, the better.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I removed the text() after / which made the derivation ending with / but still I get the warning and the records are not passing after XML input stage.
Below are the warnings
XML input document parsing failed.
Xalan fatal error (publicId: , systemId: , line: 2, column: 2): Invalid character (Unicode: 0x0)
XML input document parsing failed.
Xalan fatal error (publicId: , systemId: , line: 1, column: 6): There are more end tags than start tags
XML input document parsing failed.
Xalan fatal error (publicId: , systemId: , line: 2, column: 1): Invalid document structure
Thanks again
Below are the warnings
XML input document parsing failed.
Xalan fatal error (publicId: , systemId: , line: 2, column: 2): Invalid character (Unicode: 0x0)
XML input document parsing failed.
Xalan fatal error (publicId: , systemId: , line: 1, column: 6): There are more end tags than start tags
XML input document parsing failed.
Xalan fatal error (publicId: , systemId: , line: 2, column: 1): Invalid document structure
Thanks again
Thanks,
Surya
Surya
Which problem are we helping to resolve? Take my post and apply it to your working job in the very first post. That will show you how to get whole tags (what your very first post is getting) vs just the "value" of the tags (what your first post says you wanted).
Ernie
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Actually when waiting for the solution for the issue, I tried some options and was updating the output each time I made changes.
Now I get the output from the job in first post as wanted but only when I use External source.My requirement is to get the same output when I read the CLOB field information from the table instead of reading from a sequential file.
Please let me know if I need to open a new post for this, I felt its the same parsing XML from file Vs parsing XML from table so continued....
Now I get the output from the job in first post as wanted but only when I use External source.My requirement is to get the same output when I read the CLOB field information from the table instead of reading from a sequential file.
Please let me know if I need to open a new post for this, I felt its the same parsing XML from file Vs parsing XML from table so continued....
Thanks,
Surya
Surya
First write a job that sends your clob to a sequential file without the xml stage. Use "none" for delimeter and quote character. Look at the resulting xml. Make sure it is clean and can be opened in your browser as well formed xml.
Once you have that working, send into your xmlInput but use the "content" option instead of the url option.
Ernie
Once you have that working, send into your xmlInput but use the "content" option instead of the url option.
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
The way which Ernie suggested worked for me, loaded the XML CLOB into a file and then parsed it with XML input stage but I have around 15 more transformations to do which should include CLOB column and other columns of the table.
So it needs to be a single job, that is the reason I was asking for parsing technique from the table directly.
Sorry for the confusion, hope you understand!
So it needs to be a single job, that is the reason I was asking for parsing technique from the table directly.
Sorry for the confusion, hope you understand!
Thanks,
Surya
Surya