Parsing XML data from a column in DB

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Parsing XML data from a column in DB

Post by rajeevm »

Hi,

I have a requirement where I need to parse XML data from a column in a table from Database. I have used XML input stage but getting errors that the link has no derivation values. My design is as below

Oracle Stage -----> XML input stage -----> transformer------> seq file stage.

I really appreciate help on this one.

Regards,
R
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

You need to first import your xml metadata. Send one row from your database into a sequential file with NONE for delimeter and quote character. Then import that via import....tabledefs....xml tabledefs.....check only the inner most boxes for one of the repeating nodes. Then import THAT tabledef to your input link of the xmlInput stage.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Post by rajeevm »

I tried to pull one row which has the XML information from the table to the sequential file but receiving fatal error that operated terminated abnormally SIGSEV . Is there any other way to pull this info .
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Any chance you were given an xsd for that XML? If so you can import from that.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Depends a whole lot on the release, which Stage/Connector you are using, the length of the column, etc. In the most recent releases, xml datatypes have formal support, but there could also be issues with super long strings where you perhaps might need to access it by "reference".

That will be your first goal --- reading the table and getting the xml doc....after that, we can help you parse it.

I've had success with a lot of methods....longvarchar, trying the Connectors vs the older Stage types....trying Server Jobs vs Parallel.

A good idea is to create your own test edition of the table, and put a very small xml document in it...so you can prove the theory before working with the actual one, and also help narrow down if your problem is due to size of the document, the Stage, the release, the table itself, etc. etc. etc.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Post by rajeevm »

Hi Eostic,

I tried doing that but Oracle Enterprise stage is not supporting CLOB datatype and used DRS stage instead but still getting this error "Oracle_Enterprise_0: Error when checking operator: Caught parsing exception during wrapDescribeOperator(): Could not find type: generic:"

I appreciate your help on this one.

Regards
Raj
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

It might be worth putting that specific issue into its own thread (separate of the xml concern afterwards). I know that other people in dsxchange have had various levels of success pulling out such datatypes using things like longvarchar.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Post by rajeevm »

Hi Eostic,

I have tried using Oracle Connector stage in Datastage 9.1 and I can write the rows to a sequential file as you mentioned . But when I tried to import that sequential as XML table definition I am getting this error.

"There are multiple root elements at line 10 position 2" . What does this error mean . Is there any issue with the XML tags.

I really appreciate your help.

Thanks in Advance
Raj
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Post the XML.
-craig

"You can never have too many knives" -- Logan Nine Fingers
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Post by rajeevm »

Hi Chulett,

Here is the XML that I need to parse out , I first loaded this into the sequential file and need to parse out using XML input stage or transformer not sure what to use.

Code: Select all

<my:myFields xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2010-01-25T12:37:20">
  <my:CALLER>HOLOMAN, RITA </my:CALLER>
  <my:PHONE></my:PHONE>
  <my:PHONE_EXT></my:PHONE_EXT>
  <my:PHONEEVE>631-231-4770</my:PHONEEVE>
  <my:PHONEEVE_EXT></my:PHONEEVE_EXT>
  <my:Complete>Y</my:Complete>
</my:myFields>
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Perfect....that's probably "one row" from your Oracle stage written to the sequential file.

Make sure when you do the import that you only try to import a file that has THAT ONE SINGLE ROW.

You can't import a file that contains "many" of those rows, because as a single sequential file, it is invalid xml. You have a sequential file with a large "collection" of individual documents.

Put just one row into its own file, as you have shown above, and then do the import. Then check the INNERMOST boxes for each of your SIX columns. That's all you'll have is 6 columns. Make any ONE of them a key.

Put that table def onto the output link of your xmlInput Stage.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Post by rajeevm »

Hi Eostic,

That means I need to import that single XML chunk into table definition , what If I had different XML formats written to the same sequential file but different rows. Can that be combined and parsed.

Please let me know

Raj
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Post by rajeevm »

Hi All,

How do I get the other key columns from the database table apart from the column with XML data in it. There are 3 more columns to be copied over to the output file. Is it possible thru XML stage or how should I proceed it.

I really appreciate your help

Regards,
Raj
rajeevm
Participant
Posts: 135
Joined: Sun Jan 22, 2006 10:44 am

Post by rajeevm »

Hi All,

I could able to parse the data but have some other columns in the table I need to extract them to output. I have tried not giving the xpath description but could not get those columns and throwing some weird errors.

Please help me on this one.

Regards
raj
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

We will assume that you are using the xmlInput Stage, and that it has an input link and an output link.

Let's further assume that your input link looks like this:

col1
col2
col3
myXMLcolumn

3 columns that could be keys or whatever from your source rdbms, along with a string of some type that contains xml content.

The xmlInput Stage supports the concept of "passthru". In the input side of the Stage, indicate that the column myXMLcolumn contains "xml content" (there's a radio button for that).

Then on the output link side of the Stage, you will minimally have:

col1
col2 (all three columns with same exact spelling and all metadata)
col3
CALLER
PHONE
PHONE_EXT these with xpath in the description property
etc. with only one marked as "key"

NOthing else in the output link marked as a key. and for the non xml columns, put NOTHING in their Description properties.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Post Reply