
Resolving Special characters in XML

Posted: Thu Aug 11, 2011 1:03 pm
by parag.s.27
We are facing an issue with special characters arriving in XML messages, e.g. "&", "<", ">" etc. These special characters are handled automatically by DataStage for all elements except the "repeating elements".

I figured out that the issue is caused by the way the repeating elements are created. We did use a transformer to hardcode the XML namespace as a text and later convert it in XML stage by specifying the DATA ELEMENT as "XML". An example is given below:

We had to provide an XML message on MQ in which one particular Address section should look like the following:

Code:

<ns5:Address>
  <ns4:Line>1500 SCENIC DR</ns4:Line> 
  <ns4:Line>CROSSROADS & AVE</ns4:Line> 
  <ns4:Line>NEAR OLD CHURCH</ns4:Line> 
  <ns4:Line /> 
</ns5:Address>

The point here is the repeating element "Line", used for address lines 1, 2, 3 and 4. I produced this result by using a Transformer before the XML stage and concatenating all the address elements along with the namespace prefix, like this:

Code:

'<ns4:Line>':Addre Line 1:'</ns4:Line>':'<ns4:Line>':Addre Line 2:'</ns4:Line>':'<ns4:Line>':Addre Line 3:'</ns4:Line>':'<ns4:Line>':Addre Line 4:'</ns4:Line>'.

To summarize, the special characters are not handled by the XML Output stage if an XML element is constructed as text and later converted to XML using the DATA ELEMENT "XML".
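
The effect described above can be reproduced outside DataStage. The following is a minimal Python sketch, not DataStage code: it wraps a hand-concatenated fragment in the Address element and checks whether a standard XML parser accepts it. The namespace URIs and the helper name `is_well_formed` are assumptions made for illustration; the thread never states the real namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs (assumption): the thread does not give the real ones.
WRAPPER_OPEN = (
    '<ns5:Address xmlns:ns5="urn:example:ns5" '
    'xmlns:ns4="urn:example:ns4">'
)
WRAPPER_CLOSE = "</ns5:Address>"


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses once wrapped in the Address element."""
    try:
        ET.fromstring(WRAPPER_OPEN + fragment + WRAPPER_CLOSE)
        return True
    except ET.ParseError:
        return False


# Tags concatenated around the raw value, as in the Transformer derivation.
raw = "<ns4:Line>" + "CROSSROADS & AVE" + "</ns4:Line>"
# The same element with the ampersand escaped by hand.
escaped = "<ns4:Line>" + "CROSSROADS &amp; AVE" + "</ns4:Line>"

print(is_well_formed(raw))      # the bare "&" is rejected by the parser
print(is_well_formed(escaped))  # the escaped form is accepted
```

A check of this kind only shows that the concatenated text is not well-formed XML on its own; it says nothing about how the XML Output stage treats the column internally.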

This can be resolved by sending all the address lines as they are through one XML stage and then, just before a second XML stage, using a Transformer with the Field function to extract the already-handled value between the tags (<addrline1>........) and construct the repeating element again.

But this is only practical if the number of repeating elements is small; we have a job with 32 different repeating elements. I wanted to know if there is any robust method, using the XML stage or something else in DataStage, to handle all such scenarios.
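
For comparison, here is a rough Python sketch (not DataStage code) of what a generic approach could look like: one helper that escapes every value and then wraps it in the repeating tag, so the same logic can serve any number of repeating elements. The function name `repeating_element` and the sample values are illustrative assumptions; `xml.sax.saxutils.escape` is the standard-library function that escapes "&", "<" and ">".

```python
from xml.sax.saxutils import escape


def repeating_element(tag: str, values) -> str:
    """Build one <tag>...</tag> per value, escaping &, < and > in the text."""
    parts = []
    for value in values:
        # Treat a missing value as an empty string, then escape it.
        text = escape("" if value is None else str(value))
        parts.append(f"<{tag}>{text}</{tag}>")
    return "".join(parts)


# Sample address lines taken from the example message; None stands for a NULL.
lines = ["1500 SCENIC DR", "CROSSROADS & AVE", "NEAR OLD CHURCH", None]
print(repeating_element("ns4:Line", lines))
```

The point of the sketch is the order of operations: the value is escaped first and the tags are added afterwards, so the tags themselves are never touched by the escaping step.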

Posted: Thu Aug 11, 2011 1:58 pm
by chulett
I don't have the time to fully absorb what you need here, but this caught my eye:
"We did use a transformer to hardcode the XML namespace as a text and later convert it in XML stage by specifying the DATA ELEMENT as "XML"."
Just want to point out that using that data element does not convert anything. You use that to tell the parser that something is already XML and for it to not touch it. FYI.

Posted: Thu Aug 11, 2011 2:09 pm
by parag.s.27
What I wanted to say is that if the Data Element property is set then, as you said, DataStage will not handle special characters such as &, <, > etc. in that element.

Also, if I have repeating elements, I need to perform some transformations in a Transformer before the XML stage to create the repeating node. This transformation amounts to hardcoding XML tags, which DataStage takes as text, and so the special characters are not handled.

We need help on how to tell DataStage, especially the XML Output stage, to handle special characters in the above two scenarios.

Posted: Thu Aug 11, 2011 2:30 pm
by parag.s.27
Thanks Chulett for your time.

I resolved it in a cleaner way. The idea is similar to what you said.

Posted: Thu Aug 11, 2011 3:26 pm
by chulett
Can you explain your cleaner way, please?

Posted: Thu Aug 11, 2011 3:36 pm
by parag.s.27
I have only had partial success so far, but I think I'll be able to resolve it further. By partial success I mean that Internet Explorer shows the entire element as text (in black), while on the Tibco side the message can be parsed, though not completely, because I did not send all the elements for testing.

I used something like this:

Code:

'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN1_AD,'')):'<![CDATA[</ns4:Line>]]>':'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN2_AD,'')):'<![CDATA[</ns4:Line>]]>':'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN3_AD,'')):'<![CDATA[</ns4:Line>]]>':'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN4_AD,'')):'<![CDATA[</ns4:Line>]]>'

Posted: Sun Aug 14, 2011 7:44 pm
by eostic
That was the perfect solution, and "by design". The XML data element is intended for those situations when YOU are taking sole responsibility for building xml -- which has to include escaping any characters that require it.
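
As an illustration of what "escaping any characters that require it" involves when the XML is built by hand, here is a small Python sketch (an assumption-laden example, not DataStage code). It replaces the three characters mentioned in this thread with their predefined XML entities. The function name `escape_xml_text` is hypothetical; the one detail that matters is that "&" is replaced first.

```python
def escape_xml_text(value: str) -> str:
    """Escape &, < and > for use as XML element text."""
    # "&" must be handled first: otherwise the "&" produced by the
    # later replacements ("&lt;", "&gt;") would be escaped a second time.
    return (
        value.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


print(escape_xml_text("CROSSROADS & AVE"))
print(escape_xml_text("A < B > C"))
```

Attribute values would additionally need quote characters escaped; the sketch covers element text only, which is the case discussed here.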

Ernie

Posted: Sun Aug 14, 2011 9:02 pm
by parag.s.27
That is so true, Ernie. Though I've been working with XML and WSDL in IIS for quite some time now, every day brings something new to learn. I just love how you can implement many things in DataStage that are not part of the IIS architecture.

Extra scenario

Posted: Fri Aug 19, 2011 10:55 am
by pnpmarques
I had a similar situation (special chars) using the XML Output stage. I was trying to put together several elements from varchar columns that contained XML code.
I thought it would be enough to set Data Element=XML on the OUTPUT link, but I found that it was also necessary to set Data Element=XML on all the columns of the INPUT link.

Posted: Sun Aug 21, 2011 4:20 pm
by eostic
As you discovered, the "input" link column grid of the xmlOutput Stage is the real driver of the logic to construct the xml content. It's a bit counterintuitive, but the output link column list is really just the "receiver" of the final created xml.

Ernie