Which is the better File Format

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

kailas
Participant
Posts: 21
Joined: Mon Nov 17, 2008 11:49 pm
Location: bangalore

Which is the better File Format

Post by kailas »

Hi ,

Which is the preferred or better file format for files: fixed width or delimited?
bart12872
Participant
Posts: 82
Joined: Fri Jan 19, 2007 5:38 pm

Post by bart12872 »

In my opinion, neither format is better.
Both can be used.

My preference is the delimited format, because debugging is easier.

bart
Last edited by bart12872 on Wed Jul 29, 2009 4:40 am, edited 1 time in total.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

It is like asking which is the better vehicle - a Porsche or a Chevy van. The answer will depend upon what you want to do with it.

Fixed width files are great for parallel processing: finding the 100th line is a simple matter of multiplying the line length by the number of preceding rows. This quick direct access is not possible with variable length files. Delimited files are smaller since there is no padding and, as bart has already noted, they are easier to read manually.
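To illustrate the direct-access point, here is a minimal Python sketch (not DataStage code). The record length, sample data, and function names are assumptions made up for the example. With fixed width records, row n starts at byte offset (n - 1) * record length, so a single seek reaches it; with delimited, variable length rows, every earlier row has to be read first.

```python
import io

RECORD_LEN = 10  # hypothetical layout: 9 data bytes plus a newline


def read_fixed_record(f, n, record_len=RECORD_LEN):
    """Return the n-th record (1-based) by seeking straight to its offset."""
    f.seek((n - 1) * record_len)
    return f.read(record_len)


def read_delimited_record(f, n):
    """Return the n-th line (1-based); all earlier lines must be scanned."""
    f.seek(0)
    for i, line in enumerate(f, start=1):
        if i == n:
            return line
    return b""


# In-memory stand-ins for the two kinds of file (made-up sample data).
fixed = io.BytesIO(b"".join(b"row%06d\n" % i for i in range(1, 201)))
delimited = io.BytesIO(b"".join(b"%d,x\n" % i for i in range(1, 201)))

print(read_fixed_record(fixed, 100))          # one seek, one read
print(read_delimited_record(delimited, 100))  # reads 100 lines to get there
```

The same offset arithmetic is what lets several readers each start at a different row of a fixed width file without coordinating with each other.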
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

The one which works better for the condition.
Last edited by keshav0307 on Wed Jul 29, 2009 8:19 pm, edited 1 time in total.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

keshav0307 wrote:The one which work better for the codition.
I have no idea what "codition" means. Can you please clarify this term, ideally with a reference to a dictionary website or similar?

Meanwhile, from a performance perspective (and depending absolutely on how good the programmer is), I would prefer fixed width, because extracting or replacing fields is a simple substring (memory move) activity, whereas a delimited format necessitates a scan counting delimiters.
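A rough Python sketch of that difference (not DataStage internals). The field layout, sample records, and function names are hypothetical and exist only for illustration. The fixed width version is a plain slice at known offsets, while the delimited version has to walk the record counting delimiters before it knows where the field starts.

```python
# Hypothetical fixed-width layout: (start, end) character offsets per field.
LAYOUT = {"id": (0, 6), "name": (6, 16), "city": (16, 26)}


def fixed_field(record, name):
    """Extract a field with a plain substring at known offsets."""
    start, end = LAYOUT[name]
    return record[start:end].rstrip()


def delimited_field(record, index, delim=","):
    """Extract field `index` (0-based) by scanning and counting delimiters."""
    start = 0
    for _ in range(index):
        pos = record.find(delim, start)
        if pos == -1:
            raise IndexError("record has too few fields")
        start = pos + 1
    end = record.find(delim, start)
    return record[start:] if end == -1 else record[start:end]


# Made-up sample records carrying the same three values.
fixed_rec = "000042" + "Smith".ljust(10) + "Sydney".ljust(10)
delim_rec = "42,Smith,Sydney"

print(fixed_field(fixed_rec, "city"))
print(delimited_field(delim_rec, 2))
```

How much this matters in practice depends on record width and field count; for short records the scan is cheap.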

As a "hint", DataStage "remembers" the most recently encountered delimiter where the delimiters are @FM, but not otherwise.

Ultimately, though, the best - and only - format for source files is the one that is right for the files "they" send to you.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

Very bad, Ray. You don't have any idea.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That's why I quoted the entry before you edited it! :wink:
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

MS Word has word suggestions and an autocorrect option. Can something similar be implemented here, so that typos can be reduced?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Google Toolbar spell checker, that's what me use online. Too bad not grammar checker two. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers