XML Format Issues while generating XML output
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
XML is free format. Your expected and actual results are functionally identical. There are plenty of XML "prettifiers" out there; it is not the purpose of an ETL tool to do that. It is producing valid XML.
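A minimal sketch of what "functionally identical" means here, assuming data-oriented XML in which whitespace between elements carries no meaning. The sample documents and the helper names `normalized` and `same_content` are made up for illustration; they are not from the original post.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the same record, once indented and once on a single line.
formatted = """<customers>
  <customer id="1">
    <name>Alice</name>
  </customer>
</customers>"""
compact = '<customers><customer id="1"><name>Alice</name></customer></customers>'


def normalized(elem):
    """Reduce an element to (tag, attributes, stripped text, children).

    Whitespace-only text between elements is ignored, which is the
    assumption that makes the two layouts compare equal.
    """
    text = (elem.text or "").strip()
    attrs = tuple(sorted(elem.attrib.items()))
    children = tuple(normalized(child) for child in elem)
    return (elem.tag, attrs, text, children)


def same_content(xml_a, xml_b):
    """True if both documents carry the same elements, attributes, and text."""
    return normalized(ET.fromstring(xml_a)) == normalized(ET.fromstring(xml_b))


print(same_content(formatted, compact))
```

This comparison deliberately ignores layout only. For mixed-content XML, where text and elements are interleaved, whitespace can be significant and this check would be too lenient.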
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Formatting is for peoples, not processes - turn off the formatting and just let it stream it out. When you need to look at it, let something like Internet Explorer automagically pretty it up.
If you really, really think you "need" it to look like you posted, find yourself a pretty print utility for UNIX and hook that into your job flow as a post-process.
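As an illustration of such a post-process step, here is a small sketch that uses Python's standard library rather than a specific UNIX utility. The function name `pretty_print` and the sample input are assumptions for the example. Note that `xml.dom.minidom` loads the whole document into memory, so this is only suitable for inspecting modest files, not for reformatting hundreds of millions of records.

```python
import xml.dom.minidom


def pretty_print(xml_text, indent="  "):
    """Return an indented copy of an XML document given as a string."""
    document = xml.dom.minidom.parseString(xml_text)
    return document.toprettyxml(indent=indent)


# Hypothetical single-line output, as an ETL job might stream it.
sample = "<rows><row><id>1</id></row><row><id>2</id></row></rows>"
print(pretty_print(sample))
```

Keeping this outside the job flow, and running it only on the file you want to look at, leaves the job output compact.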
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 96
- Joined: Fri Aug 20, 2010 8:51 am
I wrote a shell script to remove the line breaks in the XML output and called it from the After-job subroutine in the job properties.
Without the shell script, the job takes 1 minute to load 1 million records.
With the shell script, the same job takes 30 minutes to load 1 million records.
I have 2 tables to load, each with 550 million records (1.1 billion records in total), so at that rate the two jobs would take 2 x 275 hours to finish.
I need a solution within DataStage itself to get acceptable performance.
Any comments?
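For reference, the extrapolation can be worked through as below. This is a back-of-the-envelope sketch that assumes run time scales linearly with record count, which the post does not establish; the constant and function names are illustrative.

```python
# Timings quoted in the post, in minutes per 1 million records.
MINUTES_PER_MILLION_WITH_SCRIPT = 30
MINUTES_PER_MILLION_WITHOUT_SCRIPT = 1

MILLIONS_PER_TABLE = 550
TABLES = 2


def hours_per_table(minutes_per_million, millions=MILLIONS_PER_TABLE):
    """Linear extrapolation: minutes per million times millions, in hours."""
    return minutes_per_million * millions / 60


with_script = hours_per_table(MINUTES_PER_MILLION_WITH_SCRIPT)
without_script = hours_per_table(MINUTES_PER_MILLION_WITHOUT_SCRIPT)

print(f"With script:    {with_script:.1f} h per table, {TABLES * with_script:.1f} h total")
print(f"Without script: {without_script:.1f} h per table, {TABLES * without_script:.1f} h total")
```

Under that linear assumption the post-processing step alone accounts for almost all of the projected run time, about 275 hours per table against roughly 9 hours without it.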
The formal solution is to uncheck "generate formatted output"... it's just a check box. The point that Ray and Craig are making is that we should never check that box.
Uncheck it and the CRLFs will just "go away".
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
-
- Premium Member
- Posts: 96
- Joined: Fri Aug 20, 2010 8:51 am
Follow Craig's note.
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Please make sure you understand that having everything "in one single line" is the standard and perfectly normal. No system should need it to be formatted and I would challenge your statement that it actually needs to be pretty printed like that.
However, I've already noted how to accomplish that.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Exactly... the desired goal is to have 'one line'. No wasted bytes...less space on the wire, on disk, etc.
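To get a feel for the "wasted bytes", the sketch below builds a made-up document and compares its size on one line against an indented copy with CRLF line endings. The element names, record count, and helper names are all assumptions for illustration; the actual overhead depends on record shape and indentation depth.

```python
import xml.dom.minidom


def build_compact(record_count):
    """Build a hypothetical single-line document with record_count rows."""
    rows = "".join(
        f"<row><id>{i}</id><name>name{i}</name></row>" for i in range(record_count)
    )
    return f"<rows>{rows}</rows>"


def sizes_in_bytes(record_count, newline="\r\n"):
    """Return (compact_size, formatted_size) in UTF-8 bytes."""
    compact = build_compact(record_count)
    root = xml.dom.minidom.parseString(compact).documentElement
    formatted = root.toprettyxml(indent="  ", newl=newline)
    return len(compact.encode("utf-8")), len(formatted.encode("utf-8"))


compact_size, formatted_size = sizes_in_bytes(1000)
overhead = (formatted_size - compact_size) / compact_size
print(f"compact: {compact_size} bytes, formatted: {formatted_size} bytes "
      f"({overhead:.0%} larger)")
```

For short records like these, indentation and line endings make up a noticeable share of the file, which is the cost that leaving the output on one line avoids.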
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>