XML Format Issues while generating XML output

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

XML is free format. Your expected and actual results are functionally identical. There are plenty of XML "prettifiers" out there; it is not the purpose of an ETL tool to do that. It is producing valid XML.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Formatting is for peoples, not processes - turn off the formatting and just let it stream it out. When you need to look at it, let something like Internet Explorer automagically pretty it up.

If you really, really think you "need" it to look like you posted, find yourself a pretty print utility for UNIX and hook that into your job flow as a post-process.
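A post-process pretty printer of the kind suggested here could look roughly like the following Python sketch. It is not part of the original job; it uses only the standard library `xml.dom.minidom` module, and the function name `pretty_print_xml` is made up for illustration. Note that `minidom` parses the whole document into memory, so this is only reasonable for files of modest size.

```python
import xml.dom.minidom


def pretty_print_xml(xml_text: str, indent: str = "  ") -> str:
    """Return an indented copy of a compact (single-line) XML document."""
    dom = xml.dom.minidom.parseString(xml_text)
    pretty = dom.toprettyxml(indent=indent)
    # Drop any whitespace-only lines so the result stays compact.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


if __name__ == "__main__":
    # Hypothetical sample record, not taken from the original question.
    compact = "<rows><row><id>1</id><name>a</name></row></rows>"
    print(pretty_print_xml(compact))
```

For very large outputs, a streaming formatter would be the safer choice, since the job itself should still write unformatted XML.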
-craig

"You can never have too many knives" -- Logan Nine Fingers
marpadga18
Premium Member
Posts: 96
Joined: Fri Aug 20, 2010 8:51 am

Post by marpadga18 »

I wrote a shell script to remove the line breaks in the XML output and called it as the After-job subroutine in the job properties.

Without the shell script, the job takes 1 minute to load 1 million records.
With the shell script, the same job takes 30 minutes to load 1 million records.

I have 2 tables to load, each with 550 million records (1.1 billion records in total). At 30 minutes per million records, each job would take about 275 hours, so the two jobs would need roughly 2 x 275 hours.
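The estimate is a straight linear extrapolation from the 1 million record timings. A minimal Python sketch of that arithmetic, assuming run time scales linearly with record count (the function name `hours_for` is only illustrative):

```python
def hours_for(records_millions: float, minutes_per_million: float) -> float:
    """Linear extrapolation: minutes per million records, scaled up, in hours."""
    return records_millions * minutes_per_million / 60


if __name__ == "__main__":
    per_table_millions = 550
    # With the shell script: 30 minutes per million records.
    print(hours_for(per_table_millions, 30))
    # Without the shell script: 1 minute per million records.
    print(round(hours_for(per_table_millions, 1), 1))
```

On these numbers the post-processing step costs about 275 hours per table, against roughly 9 hours for the job alone.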

I need a solution within DataStage itself to get the performance back.
Any comments?
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

....the formal solution is to "uncheck" 'generate formatted output'.......it's just a check box. The point that Ray and Craig are making is that we should never check that box......

....uncheck it and the crlf's will just "go away"...

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
marpadga18
Premium Member
Posts: 96
Joined: Fri Aug 20, 2010 8:51 am

Post by marpadga18 »

Ernie, I unchecked Generate Formatted Output and ran the job, and I got the entire output on one single line.
This doesn't work for me. I need the output as shown in my question.
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Follow Craig's note.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Please make sure you understand that having everything "in one single line" is standard and perfectly normal. No system should need the XML to be formatted, and I would challenge your statement that it actually needs to be pretty printed like that.

However, I've already noted how to accomplish that.
-craig
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Exactly... the desired goal is to have 'one line'. No wasted bytes...less space on the wire, on disk, etc.

Ernie
marpadga18
Premium Member
Posts: 96
Joined: Fri Aug 20, 2010 8:51 am

Post by marpadga18 »

This line break issue doesn't exist in version 7.5, but it does exist in 8.5.
For now, we have written a shell script that deletes the line breaks from an XML file.
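The script itself was not posted. As a rough illustration only, and not the poster's actual script, a line break remover can be written to stream the file in fixed-size chunks so that memory use stays flat regardless of file size. The Python sketch below assumes an ASCII-compatible encoding such as UTF-8; the function name `strip_line_breaks` and the chunk size are arbitrary choices.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def strip_line_breaks(src: BinaryIO, dst: BinaryIO,
                      chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst without CR or LF bytes; return the number removed."""
    removed = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # CR and LF are single bytes, so a chunk boundary cannot split them.
        cleaned = chunk.replace(b"\r", b"").replace(b"\n", b"")
        removed += len(chunk) - len(cleaned)
        dst.write(cleaned)
    return removed


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the original question.
    source = io.BytesIO(b"<a>\r\n<b>1</b>\n</a>\n")
    target = io.BytesIO()
    strip_line_breaks(source, target)
    print(target.getvalue())
```

This removes every line break, including any inside element text, and leaves indentation spaces in place, so it is only appropriate where that is acceptable.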
mail2hfz
Premium Member
Posts: 92
Joined: Thu Nov 16, 2006 8:51 am

Post by mail2hfz »

Hi. This is regarding the XML formatting issue. You mentioned that you used a shell script to remove the line breaks. Can you share that script?