Why do we need fixed-width columns?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Why do we need fixed-width columns?

Post by ady »

I have never used the fixed-width column format in any of my outputs. I recently took over some jobs which have fixed-width columns as both output and input.

I don't understand the concept of fixed width and was wondering why we need fixed-width columns? :?: Can anyone please explain the use?

Thanks
Krazykoolrohit
Charter Member
Posts: 560
Joined: Wed Jul 13, 2005 5:36 am
Location: Ohio

Post by Krazykoolrohit »

Some applications, like those on the mainframe, produce fixed-width columns as output. You don't have much choice.

Further, QualityStage only works with fixed-width columns.

These are the two places I have used fixed-width columns.
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

It depends on the requirement. One reason I can think of is that you might be passing that file on to some other division for their internal processing, or something along those lines. You would need to ask the people there to get a specific answer.
Kris

Where's the "Any" key?-Homer Simpson
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

I think this project is for creating an SAP BW
kris
Participant
Posts: 160
Joined: Tue Dec 09, 2003 2:45 pm
Location: virginia, usa

Re: Why do we need fixed-width columns?

Post by kris »

beaditya wrote: I don't understand the concept of fixed width and was wondering why we need fixed-width columns?
Here are my ten cents:

There is not much to the concept itself. The name says it all: the fields are fixed in width. That makes coding against the layout easy, not only in DataStage but in many applications.
Fixed field lengths make files consistently readable and easy to validate. Since the layout specifies boundary lengths, the application doesn't need to worry about handling data longer than expected, and a simple length check over the whole file is enough to rule out a bad file.

Whether you need them depends on which application is going to consume the file, and how.
Example: a mainframe application.

Usually, once the file an application expects is fixed-width, it is much less likely to have data issues such as unexpected lengths and widths.

There is no need to write fixed-width files unless an application interface expects you to, because they consume more space on disk and they force time-consuming operations such as trimming leading or trailing spaces while reading and space-padding while writing.

Since they cost extra space and time, it is not advisable to write fixed-width intermediate files within a process.

There could be more advantages that I couldn't think of; I hope others will post their ideas.

~Kris
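The whole-file length check kris mentions can be sketched in a few lines. This is a hypothetical layout (the field names and widths are invented for illustration, not taken from any DataStage job):

```python
# Hypothetical fixed-width layout: name (10 bytes), city (8 bytes), amount (6 bytes).
FIELDS = [("name", 10), ("city", 8), ("amount", 6)]
RECORD_LEN = sum(width for _, width in FIELDS)  # 24 bytes per record

def validate_and_parse(lines):
    """Reject the whole file if any record breaks the fixed layout,
    then slice each record into fields at its declared offsets."""
    records = []
    for lineno, line in enumerate(lines, start=1):
        if len(line) != RECORD_LEN:
            raise ValueError(f"record {lineno}: length {len(line)} != {RECORD_LEN}")
        record, offset = {}, 0
        for name, width in FIELDS:
            # Each field is a slice at a known offset; strip the space padding.
            record[name] = line[offset:offset + width].strip()
            offset += width
        records.append(record)
    return records

rows = validate_and_parse(["John      Boston  001250",
                           "Mary      Chicago 000042"])
```

Because every record must be exactly RECORD_LEN bytes, one length comparison per record is the entire validation, which is the consistency kris is describing.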
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Substring is much faster than delimited field extraction. Therefore most bulk loaders recommend that you prefer fixed-width format to delimited format.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
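Ray's point comes down to this: with fixed offsets each field is a direct substring, while a delimiter has to be located by scanning the bytes of the record. A small illustrative sketch (the field names, widths, and sample data are invented):

```python
# Two encodings of the same record.
fixed = "John      Boston  001250"   # fixed offsets: 0-10, 10-18, 18-24
delim = "John|Boston|001250"         # separators must be scanned for

def parse_fixed(rec):
    # Pure substring extraction: each field is a slice at a known offset.
    return rec[0:10].rstrip(), rec[10:18].rstrip(), rec[18:24]

def parse_delim(rec):
    # Delimited extraction: every byte is examined to find the separators.
    return tuple(rec.split("|"))
```

Both return the same fields, but over millions of records the fixed-offset slices avoid the per-byte separator scan, which is the effect Ray describes.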
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

If DataStage is importing a fixed-width file you can also gain some performance benefits. With (and only with) fixed-width files you can specify multiple readers. That is, instead of one process reading the file you can have several processes.

If I have 8 CPUs, why have only 1 reading my million record fixed-width file? Even having 2 readers cuts the time in half.

The reason this is limited to fixed-width is that DataStage has to be able to calculate offsets for reading the file. If I have 100 records and 2 readers, then I want the first reader to deal with records 1 thru 50, and reader #2 to deal with records 51 thru 100. In a variable length file, DataStage would actually have to read the file twice to determine where each record is and then start reading the data. That isn't very efficient.

However, with a fixed-width file DataStage knows the overall size of the file and the number of bytes per record. Based on that, DataStage can calculate where each reader should start processing. So if there are 2 readers, then Reader #1 starts at byte 0 (obviously), and Reader #2 starts at byte (record count / 2) × record length, which is simply halfway through the file.

Bet that was more than you wanted. Oh well, suffice it to say that even a Unix guy can get some value out of a mainframe idea :)

Brad.
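Brad's offset arithmetic generalises to any number of readers. A minimal sketch, assuming the file is a whole number of fixed-length records (the function name is mine for illustration, not a DataStage API):

```python
def reader_offsets(file_size, record_len, n_readers):
    """Byte offset where each of n_readers should start reading a file
    of fixed-length records, splitting the records as evenly as possible."""
    if file_size % record_len:
        raise ValueError("file size is not a whole number of records")
    n_records = file_size // record_len
    offsets = []
    for i in range(n_readers):
        # Reader i takes records [i*n/r, (i+1)*n/r); convert its first
        # record number back into a byte offset.
        first_record = (i * n_records) // n_readers
        offsets.append(first_record * record_len)
    return offsets
```

For Brad's example of 100 records of 24 bytes and 2 readers, reader #1 starts at byte 0 and reader #2 at byte 1200, i.e. record 51, with no second pass over the file.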
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Have you tried multiple readers with delimited text files? Allegedly it became possible (version 7.5.1?).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

We have not tried that yet. I had heard rumors that it was coming, but I thought it was with Hawk.

I will have to give that a try. We don't have any delimited imports (considering 90% of our data comes from the mainframe/fixed-width world). But the delimited files we do have are relatively large, so a faster read is definitely worth it.

I'll let you know if we get it working... Do you know if there are any special rules associated with that functionality, or better yet, is there any updated documentation (hhhehehe) for it?

Brad.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Source: private conversation at Information on Demand 2006

Not aware of any documentation.

The mechanism to find the start points for each reader is to position to the, say, 25% point then scan forward until the next record terminator is found. That's the start point for this reader and the end point for the previous reader. Not sure if coordination is through the section leader process or by player processes communicating with each other.

It (obviously, given the above) does not work with data files that lack line terminators.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
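The scan-forward mechanism Ray describes can be sketched as follows. This is only a guess at the idea from his description, not the actual engine implementation:

```python
def split_points(data: bytes, n_readers: int) -> list[int]:
    """Start offsets for n_readers over newline-terminated records.
    Each reader after the first seeks to its proportional share of the
    file, then scans forward to the byte after the next record
    terminator; that byte is its start point and the previous reader's
    end point."""
    points = [0]
    for i in range(1, n_readers):
        guess = (len(data) * i) // n_readers   # e.g. the 25% point
        nl = data.find(b"\n", guess)
        if nl == -1:            # no terminator past this point:
            break               # the previous reader takes the rest
        points.append(nl + 1)
    return points
```

As Ray notes, this only works because the records carry line terminators; without them there is nothing to scan forward to.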