Complex XML output generation

Paul M
Participant
Posts: 19
Joined: Mon Nov 13, 2006 11:11 am

Complex XML output generation

Post by Paul M »

I'm trying to write complex XML output using the XML Output stage. The information comes from different tables (customer, address, contacts = phone numbers). Since one customer can have multiple addresses and telephone numbers, they must appear in repeating groups in the XML.
My question: how do I produce such an output? Given one customer, I could do a lookup for the addresses, but the lookup could return multiple results. How do I merge them into the main customer flow to get one XML output message? Does anyone have a good suggestion?

Thanks in advance, Paul Mulder
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Have you read the XMLPACK_20_Designer.pdf document yet?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Paul M
Participant
Posts: 19
Joined: Mon Nov 13, 2006 11:11 am

Post by Paul M »

Craig, thanks for your answer. Yes, I've read the XML pack designer guide. It seems the input must be delivered in denormalized format, i.e. columns with repeated content for XML elements at a higher level, and columns with detailed content at a lower level.
But how must this input be created, then? It seems to be a join in a 1:n relationship, but how do you deal with joins to two different tables, as in my example of a customer with n addresses and m phone numbers?
Maybe I should change the way I look for a solution; in that case please give me a hint. Thanks, Paul
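
To make the difficulty concrete, here is a minimal Python sketch (not DataStage code) of what happens when one customer is joined to both child tables at once. The sample data and the `join_both` helper are hypothetical; the point is only that n addresses and m phone numbers produce n x m rows, with every address repeated against every phone number, so a single denormalized join does not give two independent repeating groups.

```python
from itertools import product

# Hypothetical sample data: one customer with 2 addresses and 3 phone numbers.
addresses = {"C1": ["12 Main St", "9 Harbor Rd"]}
phones = {"C1": ["555-0100", "555-0101", "555-0102"]}


def join_both(cust_id):
    """Join the customer to both child tables at once, as a two-table join would."""
    return [
        (cust_id, address, phone)
        for address, phone in product(addresses[cust_id], phones[cust_id])
    ]


rows = join_both("C1")
for row in rows:
    print(row)

# 2 addresses x 3 phone numbers = 6 rows for a single customer.
print(len(rows))
```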
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Interesting... memory is a harsh mistress. I thought all the tips we got on how to handle situations like yours were in the pdf document I mentioned. Now that I actually go back through it, I see that wasn't the case. :?

Going back through my collection of DataStage flotsam and jetsam, I found that what I was thinking of is a 2003 Ascential document by Hernando Borda called Using the XML PACK - Best Practices, which came in a zip file with a number of exported jobs and xml file samples. It documents the techniques that we've been using to great effect. Hopefully I can find someplace here to post it.

Basically, it involves creating 'chunks' of xml and loading them into hashed files keyed at the proper level. In your case, there would be one for addresses and another for phone numbers. Then your main processing job can stream keys in, pick those chunks up from the hashed files, and then generate the output you need. Not sure if that's enough of a 'hint' to get you going. :wink:
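
The 'chunks' idea can be sketched outside DataStage roughly as follows. This is only an illustration in Python under stated assumptions, not the method from the Borda document: plain dictionaries stand in for the hashed files, and the element names (`Customers`, `Customer`, `Id`, `Name`, `Address`, `Phone`), the sample rows, and the helpers `build_chunks` and `customer_xml` are all hypothetical.

```python
from collections import defaultdict
from xml.sax.saxutils import escape

# Hypothetical source rows: (customer key, value).
customers = [("C1", "Acme BV")]
address_rows = [("C1", "12 Main St"), ("C1", "9 Harbor Rd")]
phone_rows = [("C1", "555-0100"), ("C1", "555-0101"), ("C1", "555-0102")]


def build_chunks(rows, tag):
    """Concatenate one XML fragment per child row, keyed by customer id.

    The returned dict plays the role of a hashed file keyed at customer level.
    """
    chunks = defaultdict(str)
    for key, value in rows:
        chunks[key] += f"<{tag}>{escape(value)}</{tag}>"
    return chunks


# Step 1: one keyed store of chunks per repeating group.
address_chunks = build_chunks(address_rows, "Address")
phone_chunks = build_chunks(phone_rows, "Phone")


def customer_xml(cust_id, name):
    """Step 2: stream a customer key in and pick up its chunks by lookup."""
    return (
        f"<Customer><Id>{escape(cust_id)}</Id><Name>{escape(name)}</Name>"
        f"{address_chunks.get(cust_id, '')}{phone_chunks.get(cust_id, '')}"
        "</Customer>"
    )


document = (
    "<Customers>"
    + "".join(customer_xml(cust_id, name) for cust_id, name in customers)
    + "</Customers>"
)
print(document)
```

Because each repeating group is collapsed to a single value per customer before the lookup, the main flow stays at one row per customer and the two groups never multiply against each other.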
-craig

"You can never have too many knives" -- Logan Nine Fingers
thebird
Participant
Posts: 254
Joined: Thu Jan 06, 2005 12:11 am
Location: India

Post by thebird »

Paul,

I think your requirement is to get Address and Telephone number as repeating elements in the XML. This is explained on pages 4-16 and 4-17 of the XMLPACK_20_Designer document.

For example, to enable the <Address> element to repeat within the <Customer> node, you need to define the Address XPath expression as the repetition path. To do this, mark the associated input column as the key in the XML Output stage.
If there are multiple addresses/phone numbers for a particular customer, then you would need to pivot these before sending them to the XML Output stage.

In this manner, you would get multiple addresses/phone numbers as repeating in a single Customer node.

Hope this helps.

The Bird
Paul M
Participant
Posts: 19
Joined: Mon Nov 13, 2006 11:11 am

Post by Paul M »

Craig, thanks again for your fast reply. If you could find the 2003 document, I would be very grateful if you could send this to me: paul.mulder@nl.ibm.com

TheBird, thank you for your reply too. I know how to handle this part now, but how do you generate this kind of input, as I asked in one of my earlier postings (see above)? Craig suggested another approach described in a document from 2003, which I hope to receive.