Page 1 of 1

Seq File Delimiter

Posted: Mon May 23, 2005 7:47 am
by reddy
Hello Sirs,

I have a Sequential file with some fields delimiter vertical bar and few fields with vertical and double quotes,How to handle it.

Please advice me.

Thanks
Narasa

Posted: Mon May 23, 2005 7:52 am
by ArndW
Since the sequential file stage only allows one delimiter you can't do it in just one step.

I would write a job that defines this sequential file as having one column and no field delimiters at all. Run this stream through a transform that does a REPLACE of all pipe and single-quote characters into double quote characters. The transform output goes to a sequential file which is written to as one column and no field delimiters.

Then have a stream coming out of this sequential file stage with the double quote character as delimiter and with the correct column definitions. You can also make this file stage a pipe for performance (that way, no data is actually transferred to and from the disk).

This solution assumes that you don't have occurrences of pipe and single-quote in the data itself.

Posted: Mon May 23, 2005 8:38 am
by reddy
ArndW wrote:Since the sequential file stage only allows one delimiter you can't do it in just one step.

I would write a job that defines this sequential file as having one column and no field delimiters at all. Run this stream through a transform that does a REPLACE of all pipe and single-quote characters into double quote characters. The transform output goes to a sequential file which is written to as one column and no field delimiters.

Then have a stream coming out of this sequential file stage with the double quote character as delimiter and with the correct column definitions. You can also make this file stage a pipe for performance (that way, no data is actually transferred to and from the disk).

This solution assumes that you don't have occurrences of pipe and single-quote in the data itself.

Hi Andrew,

Thanks for quick response.

How to convert pipe and double quotes into single quote using a single function.

Thanks for help.
Narasa

Posted: Mon May 23, 2005 8:47 am
by ArndW
Reddy,

the function would be CONVERT("'|","''",In.BigColumn). Please note the string delimiter in the first parameter is a double quote and the second paramater uses a single quote. This function takes all occurrences of ' and | and replaces them with " and " respectively in the string In.BigColumn

Posted: Mon May 23, 2005 10:46 am
by Sainath.Srinivasan
Maybe you also have to replace your quote char to some other value than dquotes.

Posted: Mon May 23, 2005 10:52 am
by amsh76
I agree with Sainath, the best option would be to use a character, thats very unlikely to appear as data in your file. May be ~ or `

Posted: Mon May 23, 2005 11:56 am
by talk2shaanc
Just a feeling, are you sure that you are getting two types of delimeter?
Coz, I have a feeling that it could be Quote Character as double quotes and pipe as delimeter.