Limitations of DataStage for processing XML data?

snt_ds · Post by **snt_ds** » Sun Sep 28, 2008 8:24 pm

Hi All,
Does anybody have particular experience with the limitations of DataStage when processing XMS?
This might be a general question, but we are exploring different options for processing XML data.
I searched through the forums but couldn't find proper information.

Thanks for your time.

ray.wurlod · Post by **ray.wurlod** » Sun Sep 28, 2008 9:18 pm

The forums are not the place to search for documentation. You may have the documentation that was delivered with the software (that is, all the manuals). If not, ask your support provider to get it for you.

I can't help with the actual question, having no experience with XMS.

chulett · Post by **chulett** » Sun Sep 28, 2008 9:29 pm

XMS or XML? You've mentioned both, which is it?

chulett · Post by **chulett** » Mon Sep 29, 2008 7:56 am

I'm not really aware of any 'limitation' per se, what kind of concerns are you having? Have you gone through the XML PACK documentation in your Docs directory? Have you gone through the XML Best Practices document that Kim Duke is hosting for us on his website?

That would be where you should start.

chulett · Post by **chulett** » Mon Sep 29, 2008 12:46 pm

Hopefully Ernie will see the signal.

eostic · Post by **eostic** » Mon Sep 29, 2008 5:10 pm

Cool image. The kids are playing "lego batman" on the Wii..... nice. XML limitations. Too broad a subject. Let's get specific. What are you trying to accomplish?

Ernie

lstsaur · Post by **lstsaur** » Tue Sep 30, 2008 12:09 pm

If you know how to use Java to process XML documents, especially how to use StAX, don't even bother with DataStage. DataStage uses a DOM-based technique; if the XML document is larger than 2 GB, DS aborts.
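For readers who have not seen StAX, here is a minimal sketch of the streaming style referred to above. It is an illustration only, not DataStage code: the class name `StaxCount`, the method `countElements`, and the sample `<order>` element are made up for the example. The point is that the parser hands back one event at a time, so the whole document never has to sit in memory the way a DOM tree does.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this example only.
final class StaxCount {

    // Counts start tags with the given local name by pulling events one at a time.
    // No document tree is built, so memory use does not grow with document size.
    static long countElements(Reader in, String localName) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // DTD processing is not needed here, so switch it off.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            long count = 0;
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(localName)) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            // Rethrow unchecked to keep the sketch short.
            throw new IllegalStateException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document; a real job would pass a FileReader or similar.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        System.out.println(countElements(new StringReader(xml), "order"));
    }
}
```

In a real job the `Reader` would wrap a file or network stream rather than a string, which is where the flat memory profile pays off.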

eostic · Post by **eostic** » Tue Sep 30, 2008 2:16 pm

A good point lstsaur makes. Size. Something to certainly consider, as DS uses XSLT in the internals, and this requires that the document be loaded entirely into memory. This has advantages, but at the same time limits the overall size. Then again, many XML applications have far less volume than this in a "single" document.

Skills are another consideration. If you already have the Java skills, and not the DS skills, it may not even matter. If the team doing the development is mostly DS and not Java, then you have something else to consider.

What are you "doing" with the XML after you get it? This is another significant consideration. Are you reading it to write to an rdbms? Most relationally based ETL tools do what is often referred to as "shredding" the document --- into rows and columns. DataStage does this as well as anything, and if you are already working with said rdbms targets and other sources, you can hopefully re-use parts of your logic.
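To make "shredding" concrete, here is a rough sketch of the idea in plain Java (StAX), outside of DataStage. Everything specific in it is an assumption for illustration: the `<orders>/<order>` layout, the `id`, `customer` and `amount` fields, and the `XmlShredder` class name. Each `<order>` element becomes one flat row whose columns could then be bound to an insert statement for the target table.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this example only.
final class XmlShredder {

    // Turns each <order> element into one row: {id, customer, amount}.
    static List<String[]> shred(String xml) {
        List<String[]> rows = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader r = factory.createXMLStreamReader(new StringReader(xml));
            String id = null;
            String customer = null;
            String amount = null;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    switch (r.getLocalName()) {
                        case "order":
                            // A new row starts: take the key and reset the columns.
                            id = r.getAttributeValue(null, "id");
                            customer = null;
                            amount = null;
                            break;
                        case "customer":
                            customer = r.getElementText();
                            break;
                        case "amount":
                            amount = r.getElementText();
                            break;
                        default:
                            break;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && r.getLocalName().equals("order")) {
                    // The row is complete once its closing tag is reached.
                    rows.add(new String[] {id, customer, amount});
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample document.
        String xml = "<orders>"
                + "<order id=\"1\"><customer>Acme</customer><amount>10.50</amount></order>"
                + "<order id=\"2\"><customer>Globex</customer><amount>7.25</amount></order>"
                + "</orders>";
        for (String[] row : shred(xml)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

A deeper hierarchy (orders with repeating line items, say) would shred into more than one row set, typically a parent table and a child table linked by the parent key.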

If you are "writing" XML, it's a bit more tricky. Of course, if DataStage is already being used to read from a whole host of sources, it still may be the best candidate --- but if the XML has a very complex hierarchy, it can get very complicated in DS... do-able, but you'll need some strong DataStage skills.

If you are reading a complex hierarchical XML document, with the sole purpose of then writing a different, alternative complex hierarchical XML document, TX may be a better option, depending on the complexity of the document and/or the existence of other DataStage skills and assets, and the number of such transformations you need to build.

Ernie

snt_ds · Post by **snt_ds** » Tue Sep 30, 2008 7:35 pm

Thanks Ernie/lstsaur.

chulett · Post by **chulett** » Tue Sep 30, 2008 9:17 pm

eostic wrote: A good point lstsaur makes.

Ah... Ernie Ostic, the Yoda of DSXchange.

snt_ds · Post by **snt_ds** » Wed Oct 01, 2008 10:54 am

eostic wrote: What are you "doing" with the XML after you get it?

We are writing to RDBMS.