XML Output stage repetitive data elements issue

Moderators: chulett, rschirm, roy

amit.jaiswal_ATL
Premium Member
Posts: 28
Joined: Thu Oct 23, 2014 1:49 pm

Post by amit.jaiswal_ATL »

Hello All,

I have the following values in the source:
COL-A COL-B
2810945 S
2810965 S
2810985 S
4025390 H
4041510 B
4041512 B

I expect the following XML structure:

<s:LineItems>
	<s:Code>2810945</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>2810965</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>2810985</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4025390</s:Code>
	<s:Values>
		<s:CodeValue>H</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4041510</s:Code>
	<s:Values>
		<s:CodeValue>B</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4041512</s:Code>
	<s:Values>
		<s:CodeValue>B</s:CodeValue>
	</s:Values>
</s:LineItems>

But I am getting the following result:

<s:LineItems>
	<s:Code>2810945</s:Code>
	<s:Code>2810965</s:Code>
	<s:Code>2810985</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4025390</s:Code>
	<s:Values>
		<s:CodeValue>H</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4041510</s:Code>
	<s:Code>4041512</s:Code>
	<s:Values>
		<s:CodeValue>B</s:CodeValue>
	</s:Values>
</s:LineItems>

I used the following XPath expressions to generate the element structure:
COLUMN_CODE ==> /s:LineItems/s:Code
COLUMN_CODEVALUE ==> /s:LineItems/s:Values/s:CodeValue

I tried multiple options but could not generate the expected XML structure. Could you please tell me which option or setting produces the desired result?

Thanks in advance!

-Amit
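
For readers who want to see the two shapes side by side, here is a minimal Python sketch that reproduces both outputs from the sample rows. It only models the symptom: the expected output opens one block per row, while the observed output looks like consecutive rows with the same COL-B value being merged. The function names are made up for illustration, and nothing here claims to be how the XML Output stage works internally.

```python
from itertools import groupby

# Sample rows from the post: (COL-A, COL-B)
rows = [
    ("2810945", "S"),
    ("2810965", "S"),
    ("2810985", "S"),
    ("4025390", "H"),
    ("4041510", "B"),
    ("4041512", "B"),
]

def line_item(codes, code_value):
    """Render one <s:LineItems> block for the given codes and code value."""
    code_lines = "".join(f"\t<s:Code>{c}</s:Code>\n" for c in codes)
    return (
        "<s:LineItems>\n"
        f"{code_lines}"
        "\t<s:Values>\n"
        f"\t\t<s:CodeValue>{code_value}</s:CodeValue>\n"
        "\t</s:Values>\n"
        "</s:LineItems>\n"
    )

def one_block_per_row(rows):
    """Expected shape: every input row opens its own <s:LineItems>."""
    return "".join(line_item([a], b) for a, b in rows)

def grouped_by_code_value(rows):
    """Observed shape: consecutive rows sharing COL-B collapse into one block."""
    return "".join(
        line_item([a for a, _ in group], b)
        for b, group in groupby(rows, key=lambda r: r[1])
    )

expected = one_block_per_row(rows)
observed = grouped_by_code_value(rows)
```

Printing `expected` and `observed` gives the two XML listings from the post, which makes it easier to see that the repetition is being driven by COL-B rather than by each row.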
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

I suspect this is an issue that we see once in a while when there are very few elements and nothing uniquely identifies the lower-level repeating instances.

If that's the issue, you should be able to work around it by adding a unique counter to that level of your hierarchy. Create a dummy column in an upstream Transformer, and make it unique for every COL-B-level row that comes through.

Add it to your XML with its own XPath, perhaps with an element name of <dummyKey>, and make it the only repetition element.

Put this column above CodeValue, with an XPath something like:

/s:LineItems/s:Values/s:dummyKey/text()


That should put things in the right order, and then you can strip the <dummyKey> element downstream in another Transformer. Put an output link on your XML Output stage with just one large column on that link, something like myXMLContent defined as LongVarChar with a large length, and put a single '/' in the Description property.

Ernie
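
As a rough illustration of the dummy-key idea, the Python sketch below attaches a per-row counter to the sample data and lists the column-to-XPath mapping with the dummy column placed above the CodeValue column. The column name DUMMY_KEY and the helper `add_dummy_key` are hypothetical; in a real job the counter would come from the upstream Transformer, and this sketch does not reproduce any DataStage behavior.

```python
from itertools import count

# A few of the sample rows from the thread: (COL-A, COL-B)
rows = [("2810945", "S"), ("2810965", "S"), ("4025390", "H")]

# Hypothetical column-to-XPath mapping. DUMMY_KEY sits above
# COLUMN_CODEVALUE, following the suggestion in the post.
xpath_by_column = {
    "COLUMN_CODE": "/s:LineItems/s:Code",
    "DUMMY_KEY": "/s:LineItems/s:Values/s:dummyKey/text()",
    "COLUMN_CODEVALUE": "/s:LineItems/s:Values/s:CodeValue",
}

def add_dummy_key(rows, start=1):
    """Attach a counter that is unique for every row."""
    counter = count(start)
    return [
        {"COLUMN_CODE": a, "DUMMY_KEY": next(counter), "COLUMN_CODEVALUE": b}
        for a, b in rows
    ]

keyed_rows = add_dummy_key(rows)
```

The point of the counter is only that no two rows at the COL-B level look identical any more, even when COL-B itself repeats.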
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
amit.jaiswal_ATL
Premium Member
Posts: 28
Joined: Thu Oct 23, 2014 1:49 pm

Post by amit.jaiswal_ATL »

Thanks for your reply, Ernie. I am just wondering: after adding the dummyKey element with its value (a counter: 1, 2, 3, etc.) to the XML data block, how do I clean it out of that XML block in the subsequent Transformer? Are you suggesting a function like Ereplace to replace the dummyKey elements with an empty string?

Thanks.
-Amit
Amit Jaiswal
Atlanta GA USA
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

One way to do it is with something like Ereplace.

Another is to look into how the XML is being used. If it is consumed by a program that parses the XML looking only for the known elements, and does not validate against an XSD, then the new dummy elements can simply be ignored.

Ernie
Ernie Ostic

amit.jaiswal_ATL
Premium Member
Posts: 28
Joined: Thu Oct 23, 2014 1:49 pm

Post by amit.jaiswal_ATL »

Thanks Ernie. I am looking for a way to remove the extra DummyKey elements from the XML, but so far I have not found a solution.

Here is what I need to achieve.
My XML structure:
<s:LineItems>
	<s:Code>2810945</s:Code>
	<s:Values>
		<s:DummyKey>123</s:DummyKey>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>2810965</s:Code>
	<s:Values>
		<s:DummyKey>1369</s:DummyKey>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>

My requirement is to remove <s:DummyKey>123</s:DummyKey>, <s:DummyKey>1369</s:DummyKey>, and all other DummyKey elements from this XML block.

Can you please suggest a way to handle this within DataStage?

Thanks in advance.

-Amit
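
Because the counter differs in every DummyKey element, a replacement that matches one fixed string would not catch them all; the removal has to match on the tag pattern. Below is a minimal sketch of that logic in Python, written outside DataStage purely to show the pattern. It assumes the XML is one string, the element is always spelled `s:DummyKey`, and it holds only text (no nested tags). The helper name `strip_dummy_keys` is made up for illustration.

```python
import re

# Single-line XML block shaped like the one in the post.
xml_block = (
    "<s:LineItems><s:Code>2810945</s:Code><s:Values>"
    "<s:DummyKey>123</s:DummyKey><s:CodeValue>S</s:CodeValue>"
    "</s:Values></s:LineItems>"
    "<s:LineItems><s:Code>2810965</s:Code><s:Values>"
    "<s:DummyKey>1369</s:DummyKey><s:CodeValue>S</s:CodeValue>"
    "</s:Values></s:LineItems>"
)

# Match one complete DummyKey element, whatever its counter value is.
# [^<]* stops at the next tag, so a match cannot run into a sibling element.
DUMMY_KEY = re.compile(r"<s:DummyKey>[^<]*</s:DummyKey>")

def strip_dummy_keys(xml_text):
    """Return xml_text with every <s:DummyKey>...</s:DummyKey> element removed."""
    return DUMMY_KEY.sub("", xml_text)

cleaned = strip_dummy_keys(xml_block)
```

The same idea (find the opening tag, skip to the closing tag, drop everything in between) is what a Transformer loop or a custom routine would need to implement.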
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Is this XML a single string, or on multiple lines as shown? If the latter, simply use a filter (Filter stage or Transformer stage output link constraint) to prevent transfer of any line beginning with "<s:Dummy".
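
For the multi-line case, the filter amounts to a simple prefix test on each line. A minimal Python sketch of that constraint, assuming one XML element per line; the function name is hypothetical and leading whitespace is ignored so indented lines are caught too:

```python
def drop_dummy_lines(lines, prefix="<s:Dummy"):
    """Keep only lines that do not begin with prefix, ignoring leading whitespace."""
    return [line for line in lines if not line.lstrip().startswith(prefix)]

# One XML element per line, as shown in the post.
xml_lines = [
    "<s:LineItems>",
    "<s:Code>2810945</s:Code>",
    "<s:Values>",
    "<s:DummyKey>123</s:DummyKey>",
    "<s:CodeValue>S</s:CodeValue>",
    "</s:Values>",
    "</s:LineItems>",
]

kept = drop_dummy_lines(xml_lines)
```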
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
amit.jaiswal_ATL
Premium Member
Posts: 28
Joined: Thu Oct 23, 2014 1:49 pm

Post by amit.jaiswal_ATL »

Thanks Ray. Unfortunately, it is a single-line XML block. Is there any solution for this scenario?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You could possibly use the looping capability in a Transformer stage to loop through the elements and suppress the DummyKey elements from being transferred to the output.

Otherwise, of course, you could write a routine to do the same.
amit.jaiswal_ATL
Premium Member
Posts: 28
Joined: Thu Oct 23, 2014 1:49 pm

Post by amit.jaiswal_ATL »

Thanks for your suggestions, Ray. I defined the "Value" column (COL-B) as a Key column on the input link of the XML Output stage, and it gave me the expected result.
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Cool! I assumed you had "B" as the key all along. Just be careful: this same symptom can occur when you have "B" as the key but there are repeats in the parent, which is the condition I thought you were running into.

Glad you got through it!

Ernie
Ernie Ostic
