XML Output stage repetitive data elements issue

Posted: Tue Feb 17, 2015 3:43 pm
by amit.jaiswal_ATL
Hello All,

I have the following values in the source:
COL-A COL-B
2810945 S
2810965 S
2810985 S
4025390 H
4041510 B
4041512 B

I am expecting the XML structure below:

<s:LineItems>
	<s:Code>2810945</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>2810965</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>2810985</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4025390</s:Code>
	<s:Values>
		<s:CodeValue>H</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4041510</s:Code>
	<s:Values>
		<s:CodeValue>B</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4041512</s:Code>
	<s:Values>
		<s:CodeValue>B</s:CodeValue>
	</s:Values>
</s:LineItems>

But I am getting the result below:

<s:LineItems>
	<s:Code>2810945</s:Code>
	<s:Code>2810965</s:Code>
	<s:Code>2810985</s:Code>
	<s:Values>
		<s:CodeValue>S</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4025390</s:Code>
	<s:Values>
		<s:CodeValue>H</s:CodeValue>
	</s:Values>
</s:LineItems>
<s:LineItems>
	<s:Code>4041510</s:Code>
	<s:Code>4041512</s:Code>
	<s:Values>
		<s:CodeValue>B</s:CodeValue>
	</s:Values>
</s:LineItems>

I used the XPath expressions below to generate the data element structure:
COLUMN_CODE ==> /s:LineItems/s:Code
COLUMN_CODEVALUE ==> /s:LineItems/s:Values/s:CodeValue

I tried multiple options but could not generate the expected XML structure. Could you please let me know which process or option would produce the desired result?

Thanks in advance!

-Amit

Posted: Tue Feb 17, 2015 4:44 pm
by eostic
I suspect this is an issue that we see once in a while when there are very few elements and nothing uniquely identifies the lower-level repeating instances.

If that's the issue, you should be able to alleviate the problem by adding a unique counter to that level of your hierarchy. Create a dummy column in a Transformer upstream, and make it unique for every COL-B level row that comes through.

Add it to your XML with its own XPath, perhaps with an element name of <dummyKey>, and make it the only repetition element.

Put this column above CodeValue, with an XPath something like:

/s:LineItems/s:Values/s:dummyKey/text()


That should put things in the right order, and then you can clean out the <dummyKey> element downstream in another Transformer. Put an output link on your XML Output stage and just have one large column on that link, something like myXMLContent with type LongVarChar and a long length, and put a single '/' in the Description property.

Ernie

Posted: Tue Feb 17, 2015 8:07 pm
by amit.jaiswal_ATL
Thanks, Ernie, for your reply. I am just wondering: after adding the dummyKey element with its value (a counter: 1, 2, 3, etc.) to the XML data block, how do I clean it out of that XML block in the subsequent Transformer? Are you suggesting using a function like ereplace to replace the dummyKey elements with an empty string?

Thanks.
-Amit

Posted: Tue Feb 17, 2015 8:28 pm
by eostic
One way to do it is via something like ereplace.

Another is to research how the XML is being used. If it is being consumed by a program that parses the XML looking only for the known elements, and does not validate against an XSD, then just ignore the new dummy elements.

Ernie

Posted: Wed Feb 18, 2015 4:02 pm
by amit.jaiswal_ATL
Thanks, Ernie. I am looking for any option to remove the extra DummyKey elements from the XML. So far I have not been able to figure out a solution.

Here is what I have to achieve.
My XML structure:
<s:LineItems>
<s:Code>2810945</s:Code>
<s:Values>
<s:DummyKey>123</s:DummyKey>
<s:CodeValue>S</s:CodeValue>
</s:Values>
</s:LineItems>
<s:LineItems>
<s:Code>2810965</s:Code>
<s:Values>
<s:DummyKey>1369</s:DummyKey>
<s:CodeValue>S</s:CodeValue>
</s:Values>
</s:LineItems>

My requirement is to remove <s:DummyKey>123</s:DummyKey>, <s:DummyKey>1369</s:DummyKey> and all other DummyKey elements from this XML block.

Can you please suggest an option to handle this within DataStage?

Thanks in advance.

-Amit

Posted: Wed Feb 18, 2015 4:31 pm
by ray.wurlod
Is this XML a single string, or on multiple lines as shown? If the latter, simply use a filter (Filter stage or Transformer stage output link constraint) to prevent transfer of any line beginning with "<s:Dummy".

Posted: Wed Feb 18, 2015 5:22 pm
by amit.jaiswal_ATL
Thanks, Ray. Unfortunately, it is a single-line XML block. Is there a solution for this scenario?

Posted: Wed Feb 18, 2015 9:02 pm
by ray.wurlod
You could possibly use the looping capability in a Transformer stage to loop through the elements and suppress the DummyKey elements from being transferred to the output.

Otherwise, of course, a routine to do the same.
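
For illustration only, the logic such a routine would need can be sketched outside DataStage. The Python below is not DataStage code; the function name strip_dummy_keys is hypothetical, and the pattern assumes each DummyKey element holds plain text with no attributes or nested elements, as in the sample posted above.

```python
import re

# Matches one complete <s:DummyKey>...</s:DummyKey> element.
# Assumption: the element contains only text (no attributes, no child elements).
DUMMY_KEY = re.compile(r"<s:DummyKey>[^<]*</s:DummyKey>")


def strip_dummy_keys(xml_text: str) -> str:
    """Return xml_text with every DummyKey element removed (hypothetical helper)."""
    return DUMMY_KEY.sub("", xml_text)


# The single-line XML block from the earlier post, joined without line breaks.
sample = (
    "<s:LineItems><s:Code>2810945</s:Code><s:Values>"
    "<s:DummyKey>123</s:DummyKey><s:CodeValue>S</s:CodeValue>"
    "</s:Values></s:LineItems>"
    "<s:LineItems><s:Code>2810965</s:Code><s:Values>"
    "<s:DummyKey>1369</s:DummyKey><s:CodeValue>S</s:CodeValue>"
    "</s:Values></s:LineItems>"
)

cleaned = strip_dummy_keys(sample)
print(cleaned)
```

Because the whole element is matched, including its counter value, the values do not need to be known in advance. Inside a job, the same idea would have to be expressed with whatever string or looping functions the Transformer or routine offers.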

Posted: Fri Feb 20, 2015 10:23 am
by amit.jaiswal_ATL
Thanks, Ray, for your suggestions. I defined the "Value" column (COL-B) as a key column on the input of the XML Output stage, and it gave me the expected result.

Posted: Fri Feb 20, 2015 12:26 pm
by eostic
Cool! I assumed you had "B" as the key all along. Just be careful: this same symptom can occur when you have "B" as the key but you have repeats in the parent, which is the condition I thought you were running into.

Glad you got through it!

Ernie