reading xml files through datastage having dtd

cbhutani2
Participant
Posts: 8
Joined: Wed Mar 18, 2009 1:08 am

Post by cbhutani2 »

Hi,
I have an XML source with a corresponding DTD file.
Could someone please let me know the steps I need to read it in DataStage?
Is it possible in a server job or a parallel job?

Regards,
Charu.
Sumati
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The DTD isn't strictly needed, but it may provide the metadata you'd want for the XML you need to process. Import the metadata from the DTD (if the importer supports that; I don't recall), or import it from the XML directly, or possibly use the DTD to build an XSD with a tool like XMLSpy.

The metadata will give you the XPath expressions you'll need to parse the file using the XML Input stage. As for the particulars, in a parallel job follow Ernie's advice and use an External Source stage to feed in filenames. Also read the XML Best Practices document here for everything you always wanted to know about XML in DataStage but were afraid to ask.
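
To give a feel for what "XPath expressions to parse the file" means, here is a minimal sketch in plain Python, outside DataStage. It is not how the XML Input stage is configured. The sample document and its element names (`orders`, `order`, `customer`, `amount`) are made up for illustration. The idea it shows is the general one: one path picks the repeating element (one output row per match), and further paths pick the column values relative to it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are invented.
SAMPLE_XML = """<orders>
  <order id="1"><customer>Ann</customer><amount>10.50</amount></order>
  <order id="2"><customer>Bob</customer><amount>7.25</amount></order>
</orders>"""


def extract_rows(xml_text):
    """Return one dict per repeating <order> element."""
    root = ET.fromstring(xml_text)
    rows = []
    # "./order" is the repeating-element path: each match becomes one row.
    for order in root.findall("./order"):
        rows.append({
            "id": order.get("id"),                      # attribute value
            "customer": order.findtext("./customer"),   # child element text
            "amount": order.findtext("./amount"),
        })
    return rows


rows = extract_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

Note that `ElementTree` only supports a limited subset of XPath, which is enough for simple child and attribute lookups like these.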

Take your best shot at this and then let us know if you have any specific questions or issues.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:shock: Yikes... no one there has ever worked with XML, period? Forget about the "with DataStage" part, no experience whatsoever? That's going to make what you need to do pretty darn difficult. And there's no "urgent" here, just volunteers like me posting when and if they can... right now it's my Sunday afternoon and I've got plenty of other things I should be doing instead of checking DSXchange. :?

Most of that XML knowledge you'll need to pick up on your own, we're not here to teach basic concepts like Everything XML, there are plenty of other resources for that out there - dedicated websites, books like the "Dummies" book (they have a nice one for XML), classes, etc etc. We can certainly help with the nuances of "XML with DataStage", of course. That being said...

All things are possible, including reading your file. Ask the client for an XSD rather than the DTD; the latter is pretty much obsolete, and you'll need the former to get an accurate picture of everything that could be in the XML. Failing that, point the metadata importer directly at the XML and use that, but if your example doesn't contain a representative sampling of everything that could be in there, the metadata won't be as correct as it should be, or as it would be from an XSD, which describes everything. And to stop the complaint about the DTD (did you put the DTD in the same directory as the XML?), delete this line from your file, or from a copy of the file, and then import from the copy without it:

<!DOCTYPE PPRDS SYSTEM "mmlc.dtd">
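
If you would rather not edit the file by hand, making the DOCTYPE-free copy can be sketched in a few lines of Python. This is only an illustration under some assumptions: the declaration sits on a single line with no internal subset (as in the line above), the file is UTF-8, and the function names are made up for the example.

```python
import re
from pathlib import Path

# Matches a simple single-line DOCTYPE declaration with no internal subset,
# e.g. <!DOCTYPE PPRDS SYSTEM "mmlc.dtd">
DOCTYPE_RE = re.compile(r"^\s*<!DOCTYPE[^\[>]*>\s*$")


def strip_doctype(text):
    """Return text with single-line DOCTYPE declarations removed."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not DOCTYPE_RE.match(line)
    ]
    return "".join(kept)


def write_copy_without_doctype(src, dst):
    """Write a DOCTYPE-free copy of src to dst; src itself is not modified."""
    # Assumption: the file is UTF-8 encoded; adjust if yours is not.
    original = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(strip_doctype(original), encoding="utf-8")
```

Usage would be something like `write_copy_without_doctype("source.xml", "source_nodtd.xml")` (both file names are placeholders), then point the metadata importer at the copy.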

:!: Just saw the private message where all this was repeated. No need for that - questions are for the forums and never get answered privately. Everyone (even searchers arriving in the thread later) learns that way, (hopefully) benefiting from the exchange of information and knowledge.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

As Craig notes, DTDs have been pretty much replaced by XSDs (XML Schema Definitions). Have the client provide an XSD, or as complete a sample document as possible. Look on the web too; there are tools that can generate reasonable XSDs from DTDs.

Ditto to all that Craig said. Spend time learning some pure XML first; it will pay off. Then follow the references noted earlier (there are many) and absorb that best practices document as well. I also have some things worth noting at my blog URL below.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>