Hi,
I would like a recommendation on flat files vs. XML. My company is debating whether to use XML or a flat file as a source in DataStage. The data size will be around 300,000 rows.
I need a recommendation on the pros and cons of using XML over flat files. Is XML more beneficial than a flat file in DataStage?
Thanks for any advice and time.
XML vs. Flat Files
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Flat file is fastest, if for no other reason than that it's easier to parse the data from it based on metadata definitions. You only need to process flat file metadata once (assuming the format/content doesn't change, which is not always the case!). With XML there's the overhead of processing the tags and verifying them, as well as processing the data. For the small added safety it offers in getting things right, I believe that the cost of using XML over text files is unwarranted. If you have the choice, go for flat files every time! Of course, if you don't have the choice, and your data are being delivered in XML format, then DataStage can handle that, too.
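To make the parsing difference concrete, here is a minimal sketch in plain Python (not DataStage; the column names and sample records are made up for illustration). The flat file is read against one column definition that is applied to every row, while the XML version has to read and interpret a tag around every single value before it gets to the data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical column metadata: defined once, reused for every flat-file row.
COLUMNS = ["id", "name", "amount"]

# The same two (made-up) records in both formats.
FLAT = "1,Alice,10.50\n2,Bob,7.25\n"
XML = (
    "<rows>"
    "<row><id>1</id><name>Alice</name><amount>10.50</amount></row>"
    "<row><id>2</id><name>Bob</name><amount>7.25</amount></row>"
    "</rows>"
)

def parse_flat(text):
    """Split each line on commas and map the fields onto COLUMNS by position."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(COLUMNS, fields)) for fields in reader]

def parse_xml(text):
    """Parse the whole document, then read the tag name of every value."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in row} for row in root.findall("row")]

flat_rows = parse_flat(FLAT)
xml_rows = parse_xml(XML)
```

Both functions return the same list of records; the XML path simply does more work per value to get there.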
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Do we have to write a parser for the XML file in DataStage, or can DataStage handle reading and writing XML by itself?
Thanks
ray.wurlod wrote:Flat file is fastest, if for no other reason than that it's easier to parse the data from it based on metadata definitions. You only need to process flat file metadata once (assuming the format/content doesn't change, which is not always the case!). With XML there's the overhead of processing the tags and verifying them, as well as processing the data. For the small added safety it offers in getting things right, I believe that the cost of using XML over text files is unwarranted. If you have the choice, go for flat files every time! Of course, if you don't have the choice, and your data are being delivered in XML format, then DataStage can handle that, too.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
DataStage version 7.x has XML reader and XML writer stage types.
Consult on-line help or the relevant manual (these are in your DataStage client install folder, in the Docs sub-folder). The relevant manual in this case is XML PACK Designer Guide (XMLPACK_20_Designer.pdf).
In response to your private message: technically, XML files can be transmitted by FTP. They are, after all, still pure text (the tags as well as the data are text). Whether that is politically acceptable is an entirely different question!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ray has already pretty much covered this, just wanted to throw another log of opinion onto the fire. Warning, this is all "IMHO" stuff coming.
I know that XML is all the rage in some circles, and I've had people shake their heads and mutter about me being a Luddite for not choosing to use XML at every opportunity. That's the key word to me - choice. If you've got one, stick with flat files. When you need to process XML, either as a Source or Target, then go for it. Study the XML docs and make sure you understand the limitations of the different approaches. Otherwise, stick with something that's generally smaller, more flexible and faster to process - the dreaded sequential file.
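For a rough feel for the "generally smaller" point, the sketch below writes the same records as comma-delimited text and as XML and compares the byte counts. It is plain Python with made-up column names and values, so treat the ratio as illustrative only; real files will vary with tag names and data widths.

```python
import xml.etree.ElementTree as ET

def as_flat(rows):
    """Render rows as comma-delimited lines, one record per line."""
    return "".join(",".join(r) + "\n" for r in rows)

def as_xml(rows, columns):
    """Render the same rows as XML, one element per value."""
    root = ET.Element("rows")
    for r in rows:
        row = ET.SubElement(root, "row")
        for col, val in zip(columns, r):
            ET.SubElement(row, col).text = val
    return ET.tostring(root, encoding="unicode")

# Hypothetical layout and data, 1,000 rows to keep the example quick.
columns = ["id", "name", "amount"]
rows = [[str(i), f"name{i}", "10.50"] for i in range(1000)]

flat_size = len(as_flat(rows).encode("utf-8"))
xml_size = len(as_xml(rows, columns).encode("utf-8"))
print(f"flat: {flat_size} bytes, xml: {xml_size} bytes")
```

Every value in the XML version carries an opening and a closing tag, which is where the extra bytes come from.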
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: