Page 1 of 1

Read sequential file with ^M in record

Posted: Wed Jan 13, 2010 8:12 am
by Druid_Elf
Hi all,
I have a request in which we need to process a file deliverd by business (CSV).
I've configured the sequential file stage to "Record delimiter string: DOS" , and the section Field Defaults: "Delimiter= |"

The problem is that the column "process" can have a carriage return, and they also want to see this carriage return in the database to which we need to upload the file.

Example of the file record:
"<NULL>"|"<NULL>"|"False"|"12:00 u"|"<NULL>"|"Access"|"Tabel: Pakketten Sales per channel^M
Excel: C+ migraties in input sheet IDTV"|"<NULL>"|"


In vi i can see that there is also a ^M present in the column itself. I've tried different settings but the record is always send to the reject link with following warning:
Sequential_File_0,0: Input buffer overrun at field "Process"

My question is: Is the sequential file stage able to read records where a column in the file contains a carriage return ? And if so, what settings do you need to use in the stage ?

Thanks for any info you can provide.

Posted: Wed Jan 13, 2010 8:44 am
by dr.murthy
hi,

you can remove that additional characteres at sequential file level by using the vi command like

:%s/[ctrlkey+v and ctrl-key+M]//g
but actual command would be :%s/^V^M//g

Posted: Wed Jan 13, 2010 8:51 am
by Druid_Elf
But doing that would remove the carriage return from the string.
But business still wants to see the carriage return in database.
So
Tabel: Pakketten Sales per channel^M
Excel: C+ migraties in input sheet IDTV"|"<


needs to be stored in the database as
Tabel: Pakketten Sales per channel
Excel: C+ migraties in input sheet IDTV

Posted: Wed Jan 13, 2010 12:52 pm
by Sreenivasulu
When you open a file in unix which has been edited in windows 'Control M' character appears.

Its the 'next line' character.

Use dos2unix command on the file in unix and it will remove these characters like a magic wand

Regards
Sreeni

Posted: Wed Jan 13, 2010 1:25 pm
by chulett
Druid_Elf wrote:Business still wants to see the carriage return in database.

Posted: Wed Jan 13, 2010 2:16 pm
by kandyshandy
I guess you can overcome this issue when you define that particular field as LOB in table... Anyways, DB experts can give you more inputs..
When you have Control-M in LOB fields, DS handles them automatically.

Other way is to use Server job ;)

Posted: Thu Jan 14, 2010 1:36 am
by Druid_Elf
I tried with the server job and my problem is solved :)
Thanks kandyshandy.

In the column definition you can specify the setting "contains terminators", if you set this value to yes, it will keep the CR in the column and the record does not get discarded.

Thanks all for your advice. Appreciate it :D

Posted: Thu Jan 14, 2010 6:33 am
by chulett
Just want to make sure you're aware why that worked. It's because what you actually have there is a CR/LF pair. You see the CR in the data and the record break from the LF terminator. It's the LF that the option allowed you to ignore.