Fields with same name in XML

krishna81
Premium Member
Posts: 78
Joined: Tue May 16, 2006 8:01 am
Location: USA

Post by krishna81 »

Is there any way to read an XML file in which several elements arrive with the same tag name? I am able to read the XML and populate every field except the updateDate fields. What I want is: if dateSequence=0, the first updateDate should go to firstdate, and if dateSequence=1, the second updateDate should go to seconddate. (The issue is that I am getting updateDate twice.)

Here is my requirement.
Input:
<?xml version="1.0" encoding="UTF-8"?>
<XML>
<product>
<number>123</number>
<color>blue</color>
<location>A</location>
<Dates>
<dateSequence>0</dateSequence>
<updateDate>2010-01-10</updateDate>
<dateSequence>1</dateSequence>
<updateDate>2007-01-10</updateDate>
</Dates>
</product>
</XML>

The output must be:

number color location firstdate seconddate
1 blue A 2010-01-10 2007-01-10

My design flow is:
External Source stage ---> XML Input ---> Transformer ---> Sequential File
The logic I have used in the Transformer is:
If Lnk_XMLi_Parse_xml_Tfm.dateSequence=0 Then Lnk_XMLi_Parse_Payload_Tfm.updateDate Else "1800-01-01" = firstdate;
If Lnk_XMLi_Parse_xml_Tfm.dateSequence=1 Then Lnk_XMLi_Parse_Payload_Tfm.updateDate Else "1800-01-01" = seconddate;

But the output I am able to populate is:

number color location firstdate seconddate
1 blue A 2010-01-10


Thanks
Kris
Datastage User
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

It's a poorly designed XML document. It would have been nice if the second date were identified as such by its element name, or even better, if each "sequence" contained subelements (or subattributes) of "sequenceNumber" and "updateDate". Otherwise, what is represented here is just two "instances" (and thus two rows) of the same type of date [which, as you note, is not really correct, since they each have a unique meaning]. Just because they are in physical order doesn't mean much when parsing XML.
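To make the "two instances" point concrete, here is a minimal sketch in Python (not DataStage), using the standard library's xml.etree.ElementTree. It assumes the sample document from the first post with the color written as plain text. It only illustrates how a parser sees the repeated elements, not how the XML Input stage works internally.

```python
import xml.etree.ElementTree as ET

# Sample from the first post (assumption: color is plain text).
SAMPLE = """<XML>
<product>
<number>123</number>
<color>blue</color>
<location>A</location>
<Dates>
<dateSequence>0</dateSequence>
<updateDate>2010-01-10</updateDate>
<dateSequence>1</dateSequence>
<updateDate>2007-01-10</updateDate>
</Dates>
</product>
</XML>"""

dates = ET.fromstring(SAMPLE).find("product/Dates")

# Each repeated element name yields one instance per occurrence.
sequences = [e.text for e in dates.findall("dateSequence")]
updates = [e.text for e in dates.findall("updateDate")]

# Nothing in the markup ties a sequence to a date;
# document order is the only link between them.
pairs = list(zip(sequences, updates))
print(pairs)
```

Both names come back as lists of two, which is why a row-oriented parse produces two rows rather than two distinct columns.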

Before thinking of a solution, I would ask if this is just a snippet of something larger...could there be three, four, or more of these date field element pairs in order...or are there always just two? ...and are there many many more columns than just these, or is this the finite list?

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>

Post by krishna81 »

This is the finite list, and the elements always come in the same order.

Post by krishna81 »

Is there any way we can handle this situation in DataStage?

Post by eostic »

Please answer the second paragraph in my note above. How much more complex is it? That will help dictate the best solution. This is easy to handle in DataStage, but will require some additional steps.

Ernie

Post by krishna81 »

The data I posted above is a sample, but there are always just two date fields.

Post by eostic »

OK, then if there are always just two, I would just grab the whole "Dates" element in your xmlInput Stage output link: have a column called "DateElement" as a varchar 100 (or other similarly large length) and change the XPath in the Description to be /.../.../Dates/ (without the final columns and text() ).

Remove the dateSequence and updateDate columns from the xmlInput Stage's output link.

In a downstream transformer, use whatever function you need to manually pull out the first and second dates appropriately, as they will now both be on the same "row". And if the sequence is always just 1 or 2, this will be solved by an easy "substring" function.
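The substring idea can be sketched as follows. This is Python for illustration only; a Transformer derivation would use its own string functions instead. The chunk layout, the sample value of DateElement, and the helper name nth_update_date are all assumptions made for the example, not something confirmed in the thread.

```python
OPEN_TAG = "<updateDate>"
CLOSE_TAG = "</updateDate>"


def nth_update_date(chunk: str, n: int) -> str:
    """Return the text of the n-th (1-based) updateDate in chunk, or ""."""
    value = ""
    pos = 0
    for _ in range(n):
        start = chunk.find(OPEN_TAG, pos)
        if start == -1:
            return ""  # fewer than n updateDate elements
        start += len(OPEN_TAG)
        end = chunk.find(CLOSE_TAG, start)
        if end == -1:
            return ""  # unterminated element
        value = chunk[start:end].strip()
        pos = end + len(CLOSE_TAG)
    return value


# Hypothetical content of the DateElement column: the whole Dates
# element arriving as one string on one row.
date_element = (
    "<Dates><dateSequence>0</dateSequence><updateDate>2010-01-10</updateDate>"
    "<dateSequence>1</dateSequence><updateDate>2007-01-10</updateDate></Dates>"
)

firstdate = nth_update_date(date_element, 1)
seconddate = nth_update_date(date_element, 2)
print(firstdate, seconddate)
```

Searching for the tags rather than using fixed character offsets keeps the extraction working if whitespace or line breaks inside the chunk differ from the sample.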

Ernie

Post by krishna81 »

Thanks. It worked. After the first step I did the substring logic in the Transformer.
I am going to mark this as resolved.