Page 1 of 1

Extracting data from flat files

Posted: Thu Oct 19, 2006 1:38 pm
by pradkumar
Hi

I am trying to extract the raw data from a flat file. This is a fixed width (column) file. I am using a sequential file stage to extract.

Now how do I specify that it is fixed width column (like in server edition we have this option in the format tab). I donot know how this can be accomplished in parallel edition.

Thanks for all the help

Posted: Thu Oct 19, 2006 2:00 pm
by kris007
Go to the Format tab in the Sequential file stage and right click on the record level and select format as and you will find it there. :wink:

Posted: Thu Oct 19, 2006 2:21 pm
by pradkumar
Hi Kris

Thanks for the reply

Sorry to trouble you. I am a novice to data stage and learning and I worked on Server and now trying on parallel.
Actually I did use format tab and set the Record delimiter to DOS format. and also used Record length to be fixed. But it did not work out.


My raw data does not have any delimiters. I was able to do it in server setting fixed width column option.


Thanks

Posted: Thu Oct 19, 2006 2:29 pm
by kris007
It's pretty easy to find and hard to miss. Here again. Once you go to format tab in the Sequential file stage you will properties window on the left hand side with following Properties ---Record Level,Field defaults, Type Defaults. Right click on any one of them. Select or place your mouse on Format as option so that a new context menu appears in which you will find the attribute "Fixed Width Records" which is what you would need to select.

In most cases, fixed width files don't have a delimiter option and you don't need that property in your case. So remove it from there.

Posted: Thu Oct 19, 2006 3:09 pm
by pradkumar
Forgive me for my ignorance in this regard.

I did try the above mentioned method before I posted here and could not get it to work - I got an error stating unable to initialize importer - ""record_length fixed" (no length given) and record field is variable length""

First off: Is a fixed width record similar to a fixed-width record field (column) ?..........because in my case the file I am trying to extract from has Fixed-width columns i.e., record fields have fixed width

2. If there is no option to specify "fixed-width column" option like in the Server edition ; does this mean I should manually give each column a fixed width (by going to column tab>edit row>string type>field width) ?

I do apologize if this does not make much sense but whats normally a very simple procedure in the Server edition just does not seem to work for me in parallel. And this has made me try a lot of wierd methods (like question 2) :(

Posted: Thu Oct 19, 2006 4:13 pm
by kris007
pradkumar wrote: First off: Is a fixed width record similar to a fixed-width record field (column) ?..........because in my case the file I am trying to extract from has Fixed-width columns i.e., record fields have fixed width
It's one and the same. A fixed width file has records whose length is fixed and in almost all cases followed by a LineTerminator. If you have set Record Lenght attribute to fixed and defined proper terminator, then I would put my money on the file not being a fixed width file. The error message which you have provided points in that direction. It says that the record is of variable length. I would suggest you check the input file again. I have seen similar cases where the last couple of columns in the input file are of variable length. If possible can you post couple of lines from your source here to confirm it?

P.S. Did your method(question2) work for you. I doubt it.

Posted: Thu Oct 19, 2006 6:25 pm
by pradkumar
Thanks Kris

Here are few lines from my raw data file ---- there are 10 fields in each record -- given below are 3 records...

Code: Select all

DRACDDrachenblut Delikatessen                Sven Ottlieb                  Order Administrator           Walserweg 21                                      Aachen                                       52066          Germany             0241-039123         0241-059428   
      
DUMONDu monde entier                         Janine Labrune                Owner                         67, rue des Cinquante Otages                      Nantes                                       44000          France              40.67.88.88         40.67.89.89    
     
EASTCEastern Connection                      Ann Devon                     Sales Agent                   35 King George                                    London                                       WX3 6FW        UK                  (171) 555-0297      (171) 555-3373     

Posted: Thu Oct 19, 2006 8:08 pm
by ray.wurlod
The easiest way to get this right is as follows. Import the table definition for the sequential file. When on the Format tab, check the "fixed width" check box, then fill in the ten column widths as a comma separated list in the indicated field. Complete the import then load the table definition into a job that needs to read the file. All will be well (provided you got the widths correct).

Posted: Tue Oct 24, 2006 3:31 pm
by splayer
ray, in EE, in sequential file stage, there is no Fixed Width checkbox. My version is 7.5.1A.

pradkumar, a fixed width record can never have variable length columns. You have 1 or more columns of variable length data type in your record. For example, you need to change a varchar to char.

Posted: Tue Oct 24, 2006 3:52 pm
by ray.wurlod
You CAN use fixed width in parallel Sequential File stage. I did one today.

Set the Record Length property to "Fixed" and do not have VarChar data types.

If you import the table definition of a fixed-width text file, then Load onto the Format tab in the parallel Sequential File stage, this happens automatically.

Note that "first line is column names = true" is not permitted when fixed width format is used.