Extracting XML input with multiple repetitive key elements

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Narayan Reddy
Premium Member
Posts: 5
Joined: Thu Oct 20, 2011 1:35 am
Location: Bangalore

Post by Narayan Reddy »

Hi,

I have a requirement to extract data from an XML file that looks like this:

<Start>
<SeqNo>12</SeqNo>
<DocRef>
<Eur_DocRef>A</Eur_DocRef>
<Eur_DocIdent>1</Eur_DocIdent>
<Eur_DocRef>B</Eur_DocRef>
<Eur_DocIdent>2</Eur_DocIdent>
<Eur_DocRef>C</Eur_DocRef>
<Eur_DocIdent>3</Eur_DocIdent>
</DocRef>
</Start>


We put Eur_DocRef and Eur_DocIdent on separate output links of the XML Input stage, each with a single key marked as the repetition key element,
then funnel all the data together and remove duplicates based on all columns.

External_Source--->XML Input--(2 links for each repetitive key element)--> funnel-->remove duplicate--->sequential file

The output we got looks like

SeqNo Eur_Docref Eur_DocIdent
12 A 1
12 A 2
12 A 3
12 B 1
12 C 1

What we are expecting is

SeqNo Eur_Docref Eur_DocIdent
12 A 1
12 A 1
12 A 1
12 B 2
12 C 3


Please help us with this.


Regards,
Reddy
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Do you know how to specify xpath and repeating elements? If not, a search here and in Ernie's blog will be an excellent learning experience.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Narayan Reddy
Premium Member
Posts: 5
Joined: Thu Oct 20, 2011 1:35 am
Location: Bangalore

Post by Narayan Reddy »

I have tried the repetition key element, but I am not getting the expected output.

I have gone through the blogs and tried what they describe, but it did not help.

Regards,
Reddy
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

I see nothing natural in that xml that would produce the output that you say you are expecting.

"A" has no priority or parentage in that structure...it's simply another instance of docref.

I would hope and expect that you should get:

for the docref link:

12 A
12 B
12 C

...three rows.


and for the other link:

12 1
12 2
12 3


...then the rest is up to you. Pretend it was just two relational tables with 3 rows each. How do you want to combine them? There is nothing inherent in that XML structure that defines their relationship... and if there "should be", then it's a fairly poor XML structure.
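To illustrate the two-link result described above outside of DataStage, here is a minimal Python sketch. It assumes the sample XML from the first post (with the last closing tag corrected to `</Eur_DocIdent>`), and the helper name `extract_link` is made up for this example. It is not how the XML Input stage works internally; it only shows that each repeating element, taken on its own, yields three rows carrying the parent SeqNo.

```python
import xml.etree.ElementTree as ET

# Sample document from the original post, with the final closing tag corrected.
XML = """<Start>
<SeqNo>12</SeqNo>
<DocRef>
<Eur_DocRef>A</Eur_DocRef>
<Eur_DocIdent>1</Eur_DocIdent>
<Eur_DocRef>B</Eur_DocRef>
<Eur_DocIdent>2</Eur_DocIdent>
<Eur_DocRef>C</Eur_DocRef>
<Eur_DocIdent>3</Eur_DocIdent>
</DocRef>
</Start>"""


def extract_link(root, element_name):
    """Return one (SeqNo, value) row per occurrence of element_name under DocRef.

    Hypothetical helper: it plays the role of one output link whose
    repetition key is element_name.
    """
    seq_no = root.findtext("SeqNo")
    return [(seq_no, el.text) for el in root.findall(f"DocRef/{element_name}")]


root = ET.fromstring(XML)
docref_rows = extract_link(root, "Eur_DocRef")      # the "docref link"
docident_rows = extract_link(root, "Eur_DocIdent")  # the "other link"

print(docref_rows)
print(docident_rows)
```

Each list has three rows, and nothing in either list says which Eur_DocRef belongs to which Eur_DocIdent. That pairing has to be decided downstream.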

Ernie
Ernie Ostic

blogit!
Open IGC is Here! (https://dsrealtime.wordpress.com/2015/0 ... ere/)
djbarham
Participant
Posts: 34
Joined: Wed May 07, 2003 4:39 pm
Location: Brisbane, Australia

Post by djbarham »

eostic wrote: I see nothing natural in that xml that would produce the output that you say you are expecting.

I agree with Ernie. The XML does not look right. Can you post the XSD?
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

At a quick glance, it looks like your example. The XML is perfectly valid, and the XSD probably is too.

Replaced by, amended by, corrected by (and others): each of these is simply an independent repeating element.

Retrieve them as I noted above for your original example, on separate links, then decide how to combine them downstream (perhaps there are other keys, or higher-level element and attribute info on each row, to help your decision).

The parsing is simple: multiple output links.
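One possible way to "combine them downstream" is to pair rows by their position within each SeqNo. The Python sketch below shows that idea under a loud assumption: that document order defines the pairing, which, as noted earlier in the thread, the XML structure itself does not guarantee. The function names `add_position` and `join_by_position` are hypothetical. In job terms this is roughly comparable to adding a per-key row counter on each link and joining on (SeqNo, counter) instead of funneling.

```python
def add_position(rows):
    """Tag each (seq_no, value) row with its 1-based position within its seq_no."""
    counters = {}
    tagged = []
    for seq_no, value in rows:
        counters[seq_no] = counters.get(seq_no, 0) + 1
        tagged.append((seq_no, counters[seq_no], value))
    return tagged


def join_by_position(left_rows, right_rows):
    """Join two row sets on (seq_no, position).

    ASSUMPTION: the n-th row on the left belongs with the n-th row on the
    right for the same seq_no. Unmatched left rows get None on the right.
    """
    right_lookup = {(s, p): v for s, p, v in add_position(right_rows)}
    return [(s, v, right_lookup.get((s, p))) for s, p, v in add_position(left_rows)]


# The two three-row sets expected from the separate links.
docref_rows = [("12", "A"), ("12", "B"), ("12", "C")]
docident_rows = [("12", "1"), ("12", "2"), ("12", "3")]

combined = join_by_position(docref_rows, docident_rows)
for row in combined:
    print(*row)
```

This yields one row per pair (A with 1, B with 2, C with 3). If the pairing should instead come from some other key or a higher-level element, the join condition would need to change accordingly.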

Ernie
Ernie Ostic
