handling a (sort of) fixed length file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
hobocamp
Premium Member
Premium Member
Posts: 98
Joined: Thu Aug 31, 2006 10:04 am

handling a (sort of) fixed length file

Post by hobocamp »

Wondering if anyone has run into anything similar to this -

We need to bring in a new vendor file through ETL. It's basically a fixed length file, with the exception that any empty or null fields have been replaced by a tilde (~) in a one-byte field. So of course a tilde could represent anything from 1 to X bytes. Any suggestions would be appreciated.

Thanks in advance.

Tom
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Then you have to treat it as not fixed width. Read it as a single VarChar column, and parse it within a Transformer stage. Or pre-process the file to insert delimiter characters, using either a shell script or a BASIC routine.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
hobocamp
Premium Member
Premium Member
Posts: 98
Joined: Thu Aug 31, 2006 10:04 am

Post by hobocamp »

Ray -

Thanks for your response.

I thought it might involve reading each row as a single string.

As far as parsing within a transformer, do you have any suggestions how this might be done? I'm having trouble coming up with how I would keep my place in the row, and match that to the file schema. For example, once the first tilda is found, the record length is shortened by FieldLength -1, and each successive tilda (if any) will do the same thing. (Sorry if this description is fuzzy - I'm having almost as much trouble describing it as thinking about it.)

Thanks again.
Tom
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So... you've got records with no field delimiters but not all fields are fully populated? Who comes up with crap formats like this? :evil:

I don't see how you would do something like this outside of a C or C++ program. Or, I guess, a BASIC routine that does the same, but I imagine that would be... slower. Stream in a character at a time and decide what to do with it based on record number, current position and if it is a tilde or not. I'd probably try to build a 'normal' delimited file (or record in a routine) from that output and then process it... well, normally.

I know what my first reaction would be if someone tried to pawn this off on me - forget it. Push back. Insist on a proper fixed-width file with no 'sort of' involved or get a proper delimited file. Easier said then done, I know, but I would certainly explore that as my first 'option'. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Unless you have a column separator character you won't be able to do this without a lot of work. You would then need to know the starting position of each column and match that to the position of the tilde; then insert the appropriate number of dummy characters to shift everything to the right of the tilde to the correct position. If multiple tildes exist per line you would need to put this into a loop.
hobocamp
Premium Member
Premium Member
Posts: 98
Joined: Thu Aug 31, 2006 10:04 am

Post by hobocamp »

Thanks everyone for your suggestions. I was able to come up with a Basic routine to handle the file. I use a counter as a positional reference to keep track of where I am in the file. It's not really reusable (we have several of this file type and I'll have to customize the routine for each file), but at least it works!

Thanks again.

I'll mark this thread as Resolved.

Tom
Post Reply