Ampersand sign in XML Stage Causing error

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
darrreever
Participant
Posts: 19
Joined: Tue Feb 23, 2010 11:15 am
Location: Los Angeles

Ampersand sign in XML Stage Causing error

Post by darrreever »

Hello DataStagers:

I am reading a string from a 9.7DB2 table. The string contains the ampersand "&" sign. (an example "Johnson & Johnson")When I pass the VARCHAR string to an XML Input Stage (not the 8.5 XML Stage) with fatal, error, and warning set to go to a reject link, the record gets rejected.

The same behaviour happens with the !@#$% symbols as well. However, when I use NVARCHAR as the string data type, all of the special symbols pass through fine. The record containing the & is still rejected.

I understand that the "&" has special meaning in both XML and HTML and so this is probably the issue. As you know, the symbols I listed, as well as &, are all part of the ascii set and not double byte or require unicode handling.

Has anyone experienced/solved this problem?

Thanks
God Bless
Darryl
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That ampersand needs to be replaced with & to be valid in XML data.
-craig

"You can never have too many knives" -- Logan Nine Fingers
darrreever
Participant
Posts: 19
Joined: Tue Feb 23, 2010 11:15 am
Location: Los Angeles

Post by darrreever »

Hello Mr. Craig!

Thanks for the quick response! Right on the money, yet again! :D

You are exactly right. Easily checked by simply putting the xml message in notepad, saving as an .xml, and viewing in IE or other browser. With just a '&' the xml will not view, but replacing '&' with '&' (no quotes) it displays perfectly.

Now comes the perennial problem of how to search for more than one occurence of a string and replace with another string in a parallel job.

Once again thanks.
Darryl
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The EReplace() function can replace multiple instances. If you're earlier than 9.1, track down the pxEReplace() function published on DSXchange.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rameshrr3
Premium Member
Premium Member
Posts: 609
Joined: Mon May 10, 2004 3:32 am
Location: BRENTWOOD, TN

Post by rameshrr3 »

replace all double quotes ", ampersand &, apostrophe ' [single quotes] ,greater-than >, less-than < signs with &quot,&amp,&apos,&gt,&lt respectively . I guess XML parser considers these as reserved characters. Create a function that replaces these characters one after another - possibly using cascaded Change() or Ereplace() for server jobs or use some equivalent in c/C++ for a custom parallel routine, the DB2 SQL function library should also have some functions to replace characters.
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

All of this is a correct discussion about escape characters and xml, but it would be useful to add some more context....

The initial post notes:

"I am reading a string from a 9.7 DB2 table. The string contains the ampersand sign........When I pass the VARCHAR string to an XMLInput Stage......."

So... is the "string" that is coming from DB2 already an xml document or chunk of xml content?

If so, then yes, everything said here is correct --- however, the tooling that generated the string is wrong --- it should be converting that ampersand automatically BEFORE it loads it into db2.....

Indeed, if DataStage xmlOutput Stage (or xml stage) was being used to "generate" this xml, it would be escaped automatically....

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
rameshrr3
Premium Member
Premium Member
Posts: 609
Joined: Mon May 10, 2004 3:32 am
Location: BRENTWOOD, TN

Post by rameshrr3 »

I have seen some XML where the element data had ampersands, evidently not all xml generating tools make these conversions. Interstingly even datastage 8.0x had a bug when you defined an environment variable whose value is enclosed in opening and closing tag pair like <env_value>.

The one question to the OP , is did you set the data element for this column to 'XML' ?

Thanks
Ramesh
Post Reply