Page 1 of 1
Extracting data from flat files
Posted: Thu Oct 19, 2006 1:38 pm
by pradkumar
Hi
I am trying to extract the raw data from a flat file. This is a fixed width (column) file. I am using a sequential file stage to extract.
Now how do I specify that it is fixed width column (like in server edition we have this option in the format tab). I donot know how this can be accomplished in parallel edition.
Thanks for all the help
Posted: Thu Oct 19, 2006 2:00 pm
by kris007
Go to the Format tab in the Sequential file stage and right click on the record level and select format as and you will find it there.
![Wink :wink:](./images/smilies/icon_wink.gif)
Posted: Thu Oct 19, 2006 2:21 pm
by pradkumar
Hi Kris
Thanks for the reply
Sorry to trouble you. I am a novice to data stage and learning and I worked on Server and now trying on parallel.
Actually I did use format tab and set the Record delimiter to DOS format. and also used Record length to be fixed. But it did not work out.
My raw data does not have any delimiters. I was able to do it in server setting fixed width column option.
Thanks
Posted: Thu Oct 19, 2006 2:29 pm
by kris007
It's pretty easy to find and hard to miss. Here again. Once you go to format tab in the Sequential file stage you will properties window on the left hand side with following Properties ---Record Level,Field defaults, Type Defaults. Right click on any one of them. Select or place your mouse on Format as option so that a new context menu appears in which you will find the attribute "Fixed Width Records" which is what you would need to select.
In most cases, fixed width files don't have a delimiter option and you don't need that property in your case. So remove it from there.
Posted: Thu Oct 19, 2006 3:09 pm
by pradkumar
Forgive me for my ignorance in this regard.
I did try the above mentioned method before I posted here and could not get it to work - I got an error stating
unable to initialize importer - ""record_length fixed" (no length given) and record field is variable length""
First off: Is a fixed width
record similar to a fixed-width
record field (column) ?..........because in my case the file I am trying to extract from has Fixed-width columns i.e., record fields have fixed width
2. If there is no option to specify "fixed-width column" option like in the Server edition ; does this mean I should manually give each column a fixed width (by going to column tab>edit row>string type>field width) ?
I do apologize if this does not make much sense but whats normally a very simple procedure in the Server edition just does not seem to work for me in parallel. And this has made me try a lot of wierd methods (like question 2)
![Sad :(](./images/smilies/icon_sad.gif)
Posted: Thu Oct 19, 2006 4:13 pm
by kris007
pradkumar wrote:
First off: Is a fixed width record similar to a fixed-width record field (column) ?..........because in my case the file I am trying to extract from has Fixed-width columns i.e., record fields have fixed width
It's one and the same. A fixed width file has records whose length is fixed and in almost all cases followed by a LineTerminator. If you have set Record Lenght attribute to fixed and defined proper terminator, then I would put my money on the file
not being a fixed width file. The error message which you have provided points in that direction. It says that the record is of variable length. I would suggest you check the input file again. I have seen similar cases where the last couple of columns in the input file are of variable length. If possible can you post couple of lines from your source here to confirm it?
P.S. Did your method(question2) work for you. I doubt it.
Posted: Thu Oct 19, 2006 6:25 pm
by pradkumar
Thanks Kris
Here are few lines from my raw data file ---- there are 10 fields in each record -- given below are 3 records...
Code: Select all
DRACDDrachenblut Delikatessen Sven Ottlieb Order Administrator Walserweg 21 Aachen 52066 Germany 0241-039123 0241-059428
DUMONDu monde entier Janine Labrune Owner 67, rue des Cinquante Otages Nantes 44000 France 40.67.88.88 40.67.89.89
EASTCEastern Connection Ann Devon Sales Agent 35 King George London WX3 6FW UK (171) 555-0297 (171) 555-3373
Posted: Thu Oct 19, 2006 8:08 pm
by ray.wurlod
The easiest way to get this right is as follows. Import the table definition for the sequential file. When on the Format tab, check the "fixed width" check box, then fill in the ten column widths as a comma separated list in the indicated field. Complete the import then load the table definition into a job that needs to read the file. All will be well (provided you got the widths correct).
Posted: Tue Oct 24, 2006 3:31 pm
by splayer
ray, in EE, in sequential file stage, there is no Fixed Width checkbox. My version is 7.5.1A.
pradkumar, a fixed width record can never have variable length columns. You have 1 or more columns of variable length data type in your record. For example, you need to change a varchar to char.
Posted: Tue Oct 24, 2006 3:52 pm
by ray.wurlod
You CAN use fixed width in parallel Sequential File stage. I did one today.
Set the Record Length property to "Fixed" and do not have VarChar data types.
If you import the table definition of a fixed-width text file, then Load onto the Format tab in the parallel Sequential File stage, this happens automatically.
Note that "first line is column names = true" is not permitted when fixed width format is used.