Parsing XML data from a column in DB
Moderators: chulett, rschirm, roy
Parsing XML data from a column in DB
Hi,
I have a requirement where I need to parse XML data from a column in a table from Database. I have used XML input stage but getting errors that the link has no derivation values. My design is as below
Oracle Stage -----> XML input stage -----> transformer------> seq file stage.
I really appreciate help on this one.
Regards,
R
I have a requirement where I need to parse XML data from a column in a table from Database. I have used XML input stage but getting errors that the link has no derivation values. My design is as below
Oracle Stage -----> XML input stage -----> transformer------> seq file stage.
I really appreciate help on this one.
Regards,
R
You need to first import your xml metadata. Send one row from your database into a sequential file with NONE for delimeter and quote character. Then import that via import....tabledefs....xml tabledefs.....check only the inner most boxes for one of the repeating nodes. Then import THAT tabledef to your input link of the xmlInput stage.
Ernie
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Depends a whole lot on the release, which Stage/Connector you are using, the length of the column, etc. In the most recent releases, xml datatypes have formal support, but there could also be issues with super long strings where you perhaps might need to access it by "reference".
That will be your first goal --- reading the table and getting the xml doc....after that, we can help you parse it.
I've had success with a lot of methods....longvarchar, trying the Connectors vs the older Stage types....trying Server Jobs vs Parallel.
A good idea is to create your own test edition of the table, and put a very small xml document in it...so you can prove the theory before working with the actual one, and also help narrow down if your problem is due to size of the document, the Stage, the release, the table itself, etc. etc. etc.
Ernie
That will be your first goal --- reading the table and getting the xml doc....after that, we can help you parse it.
I've had success with a lot of methods....longvarchar, trying the Connectors vs the older Stage types....trying Server Jobs vs Parallel.
A good idea is to create your own test edition of the table, and put a very small xml document in it...so you can prove the theory before working with the actual one, and also help narrow down if your problem is due to size of the document, the Stage, the release, the table itself, etc. etc. etc.
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Hi Eostic,
I tried doing that but Oracle Enterprise stage is not supporting CLOB datatype and used DRS stage instead but still getting this error "Oracle_Enterprise_0: Error when checking operator: Caught parsing exception during wrapDescribeOperator(): Could not find type: generic:"
I appreciate your help on this one.
Regards
Raj
I tried doing that but Oracle Enterprise stage is not supporting CLOB datatype and used DRS stage instead but still getting this error "Oracle_Enterprise_0: Error when checking operator: Caught parsing exception during wrapDescribeOperator(): Could not find type: generic:"
I appreciate your help on this one.
Regards
Raj
It might be worth putting that specific issue into its own thread (separate of the xml concern afterwards). I know that other people in dsxchange have had various levels of success pulling out such datatypes using things like longvarchar.
Ernie
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Hi Eostic,
I have tried using Oracle Connector stage in Datastage 9.1 and I can write the rows to a sequential file as you mentioned . But when I tried to import that sequential as XML table definition I am getting this error.
"There are multiple root elements at line 10 position 2" . What does this error mean . Is there any issue with the XML tags.
I really appreciate your help.
Thanks in Advance
Raj
I have tried using Oracle Connector stage in Datastage 9.1 and I can write the rows to a sequential file as you mentioned . But when I tried to import that sequential as XML table definition I am getting this error.
"There are multiple root elements at line 10 position 2" . What does this error mean . Is there any issue with the XML tags.
I really appreciate your help.
Thanks in Advance
Raj
Hi Chulett,
Here is the XML that I need to parse out , I first loaded this into the sequential file and need to parse out using XML input stage or transformer not sure what to use.
Here is the XML that I need to parse out , I first loaded this into the sequential file and need to parse out using XML input stage or transformer not sure what to use.
Code: Select all
<my:myFields xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2010-01-25T12:37:20">
<my:CALLER>HOLOMAN, RITA </my:CALLER>
<my:PHONE></my:PHONE>
<my:PHONE_EXT></my:PHONE_EXT>
<my:PHONEEVE>631-231-4770</my:PHONEEVE>
<my:PHONEEVE_EXT></my:PHONEEVE_EXT>
<my:Complete>Y</my:Complete>
</my:myFields>
Perfect....that's probably "one row" from your Oracle stage written to the sequential file.
Make sure when you do the import that you only try to import a file that has THAT ONE SINGLE ROW.
You can't import a file that contains "many" of those rows, because as a single sequential file, it is invalid xml. You have a sequential file with a large "collection" of individual documents.
Put just one row into its own file, as you have shown above, and then do the import. Then check the INNERMOST boxes for each of your SIX columns. That's all you'll have is 6 columns. Make any ONE of them a key.
Put that table def onto the output link of your xmlInput Stage.
Ernie
Make sure when you do the import that you only try to import a file that has THAT ONE SINGLE ROW.
You can't import a file that contains "many" of those rows, because as a single sequential file, it is invalid xml. You have a sequential file with a large "collection" of individual documents.
Put just one row into its own file, as you have shown above, and then do the import. Then check the INNERMOST boxes for each of your SIX columns. That's all you'll have is 6 columns. Make any ONE of them a key.
Put that table def onto the output link of your xmlInput Stage.
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
We will assume that you are using the xmlInput Stage, and that it has an input link and an output link.
Let's further assume that your input link looks like this:
col1
col2
col3
myXMLcolumn
3 columns that could be keys or whatever from your source rdbms, along with a string of some type that contains xml content.
The xmlInput Stage supports the concept of "passthru". In the input side of the Stage, indicate that the column myXMLcolumn contains "xml content" (there's a radio button for that).
Then on the output link side of the Stage, you will minimally have:
col1
col2 (all three columns with same exact spelling and all metadata)
col3
CALLER
PHONE
PHONE_EXT these with xpath in the description property
etc. with only one marked as "key"
NOthing else in the output link marked as a key. and for the non xml columns, put NOTHING in their Description properties.
Ernie
Let's further assume that your input link looks like this:
col1
col2
col3
myXMLcolumn
3 columns that could be keys or whatever from your source rdbms, along with a string of some type that contains xml content.
The xmlInput Stage supports the concept of "passthru". In the input side of the Stage, indicate that the column myXMLcolumn contains "xml content" (there's a radio button for that).
Then on the output link side of the Stage, you will minimally have:
col1
col2 (all three columns with same exact spelling and all metadata)
col3
CALLER
PHONE
PHONE_EXT these with xpath in the description property
etc. with only one marked as "key"
NOthing else in the output link marked as a key. and for the non xml columns, put NOTHING in their Description properties.
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>