
writing to XML file

Posted: Thu Jan 25, 2007 12:50 am
by vij
Hi all,

I have a job which is like this:
source dataset ---> transformer stage ---> target XML file
The dataset has about 100,000 records, and writing to the XML file is very slow. If I replace the XML file with a sequential file, it is faster. Is there anything I am missing when writing to XML, or is DataStage simply slow at writing XML files?

Thanks in advance!

Posted: Thu Jan 25, 2007 1:41 am
by lstsaur
You need an XML Output Stage to write out the XML to your target file.

Posted: Thu Jan 25, 2007 1:51 am
by vij
I am already using that stage.

Posted: Thu Jan 25, 2007 2:27 am
by ray.wurlod
Quantify "very slow" and "faster" without using rows/sec which, as I have established elsewhere, is an essentially meaningless metric. What volume of characters is written into a text file versus what volume of characters is written into the XML file? How much extra processing is needed to create the XML tags?
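One way to put numbers on the character-volume question is sketched below in Python. This is not DataStage code; the column names, sample rows, and the `record` wrapper tag are hypothetical and only illustrate how much extra text the tags add for the same data.

```python
import csv
import io
from xml.sax.saxutils import escape


def delimited_size(rows, columns):
    """Count the characters needed to write the rows as comma-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return len(buf.getvalue())


def xml_size(rows, columns, record_tag="record"):
    """Count the characters needed to write the same rows as simple XML elements."""
    parts = []
    for row in rows:
        fields = "".join(
            f"<{c}>{escape(str(row[c]))}</{c}>" for c in columns
        )
        parts.append(f"<{record_tag}>{fields}</{record_tag}>\n")
    return len("".join(parts))


# Hypothetical sample data; real measurements should use the job's own records.
columns = ["id", "amount"]
rows = [
    {"id": "1", "amount": "10.50"},
    {"id": "2", "amount": "7.25"},
]

flat_chars = delimited_size(rows, columns)
xml_chars = xml_size(rows, columns)
print(f"delimited: {flat_chars} chars, XML: {xml_chars} chars")
```

The ratio depends heavily on tag names and value lengths, so it only explains part of any elapsed-time difference; the work of building the XML itself is separate.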

Posted: Thu Jan 25, 2007 3:06 am
by vij
For 146,900 records it takes 9 minutes, and there is no logic involved in the Transformer stage; it just converts the decimal column values to string. The Transformer stage is there because the decimal column values do not appear in the generated XML file if the data type is decimal.

Posted: Thu Jan 25, 2007 3:08 am
by ray.wurlod
... and in the case of sequential file output, did you keep the same transformations so that the test is fair?

Posted: Thu Jan 25, 2007 3:33 am
by vij
Yes, it took just a minute! So I have a difference of about 8 minutes for the same number of records.

Posted: Thu Jan 25, 2007 7:40 am
by chulett
Apples and Oranges. The XML parsing is Java under the covers (Xerces based, IIRC) and not all that speedy. Be happy with your 9 minutes, there is plenty of 'logic' involved.

Posted: Thu Jan 25, 2007 1:51 pm
by pavankvk
The XML Output stage is sequential by default. Try enabling parallel mode, but be aware that it may break your XML generation logic if that logic depends on certain keys. Give it a try.
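The caution about keys can be illustrated with a small Python sketch. It is not DataStage code, and it assumes (hypothetically) that the XML is grouped on an `order_id` column: if rows sharing a key are spread over several partitions, each partition sees only part of a group, whereas partitioning on the key keeps every group together. All function and column names here are made up for illustration.

```python
import zlib
from collections import defaultdict


def round_robin(rows, n_partitions):
    """Deal rows out to partitions in turn, ignoring their key values."""
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions


def partition_by_key(rows, key, n_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        # crc32 is used because it is stable across runs, unlike hash() on str.
        idx = zlib.crc32(str(row[key]).encode("utf-8")) % n_partitions
        partitions[idx].append(row)
    return partitions


def keys_split_across_partitions(partitions, key):
    """Return the key values whose rows ended up in more than one partition."""
    seen = defaultdict(set)
    for idx, part in enumerate(partitions):
        for row in part:
            seen[row[key]].add(idx)
    return {k for k, idxs in seen.items() if len(idxs) > 1}


# Hypothetical rows: two groups that should each become one XML parent element.
rows = [
    {"order_id": "A", "item": "1"},
    {"order_id": "A", "item": "2"},
    {"order_id": "B", "item": "3"},
    {"order_id": "B", "item": "4"},
]

split_rr = keys_split_across_partitions(round_robin(rows, 2), "order_id")
split_key = keys_split_across_partitions(partition_by_key(rows, "order_id", 2), "order_id")
print("split under round robin:", sorted(split_rr))
print("split under key partitioning:", sorted(split_key))
```

Even with key-based partitioning, each partition would produce its own XML fragment, so whether the combined output is still a valid single document is something to check in the actual job.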

Posted: Thu Jan 25, 2007 7:02 pm
by eostic
Craig is right. Be happy. XML is going to be inherently slower than anything else, especially pure sequential stage functionality.