Page 1 of 1

Trimming trailing space in a delimited text file

Posted: Wed Jun 23, 2010 9:52 pm
by muralisankarr
I have a delimited text file with a spaces appended to all the fields and thus it appear to be a delimited fixed width file. I read the file with delimiter defining the columns. If the actual length of the field is less than the defined column length, then the field will have space appended to them. A sample text is given below

Code: Select all

LOGISTIC_TYPE~LOGISTIC_REF~TRADE_GROUP
MR  ~(747-400).MR-17-122      ~ENGINE
MR  ~(747-400)MR-17-122       ~ENGINE
MR  ~(767).MR-15-151/03       ~ENGINE
MR  ~(767)MR-15-151           ~ENGINE
AMM ~(767/747-400)MM.71-00-00 ~ENGINE
I have 1000 of files and each file has 100-300 columns. Defining the metadata, pointing the NULL field length equivalent to column length and triming the space on the columns (in a transformer) is very tedious. Is there any better way to trim the trailing space in all the columns using a combination of environmental variable and filter program ?

Thanks
MSR

Posted: Wed Jun 23, 2010 11:02 pm
by ray.wurlod
Declare all the fields as VarChar when importing the "table definition". Use a Transformer stage. Use Derivation Substitution to apply a Trim() function to as many VarChar fields as you like with a single operation.

Posted: Thu Jun 24, 2010 3:21 am
by Sainath.Srinivasan
If you want that for all fields, you can try something like

Code: Select all

sed 's/[ ]*~/~/g' yourFileName
Note - above command is not tested.

Posted: Thu Jun 24, 2010 3:37 am
by Sreenivasulu
Remeber to keep the db fields as 'varchar'. If you put 'char' then spaces will get appended

Regards
Sreeni