read column with new line characters

zulfi123786 · Post by **zulfi123786** » Mon Mar 19, 2012 3:19 am

The file has 10 columns, one of which contains newline characters. This column is enclosed in double quotes, but the record delimiter is also a UNIX newline, and this causes the record to be skipped.

I was successful in reading the file when the record delimiter was changed from newline to another character and the column was specified as quoted, but the same does not work when newline is the record delimiter.

This can be done by reading the file with a Server Sequential File stage, but I can't go with that. Also, I need to preserve the \n characters.

Any ideas would be appreciated.

Thanks

prasson_ibm · Post by **prasson_ibm** » Mon Mar 19, 2012 7:28 am

You can remove the newline using the UNIX command below:


awk -F"delimiter" 'NF<10{printf("%s",$0);getline;print;next}1'
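The one-liner above joins a short line with only the single line that follows it, and it drops the line break. A rough sketch of a variant is given here, which counts double quotes instead of fields and swaps each embedded line break for a placeholder, so that a later step could put it back. The function name `join_quoted_records` and the default placeholder (the two characters `\n`) are made up for illustration. It assumes quotes are balanced, that a quote inside a field is written doubled (`""`), and that the placeholder does not already occur in the data. How the placeholder gets converted back inside the job is not covered here.

```shell
# join_quoted_records: hypothetical helper, not part of any product.
# Reads delimited text on stdin and writes one physical line per logical
# record. A line break inside a double-quoted field is replaced with a
# placeholder (first argument; defaults to a backslash followed by n).
join_quoted_records() {
  PLACEHOLDER="${1:-\\n}" awk '
    BEGIN { ph = ENVIRON["PLACEHOLDER"]; inq = 0 }
    {
      # Count the double quotes on this physical line.
      tmp = $0
      n = gsub(/"/, "&", tmp)

      # If a quoted field is still open, append to the pending record.
      if (inq)
        buf = buf ph $0
      else
        buf = $0

      # An odd number of quotes opens or closes a quoted field.
      if (n % 2 == 1)
        inq = !inq

      # Emit the record once no quoted field is left open.
      if (!inq)
        print buf
    }
    # Flush a trailing record whose closing quote never arrived.
    END { if (inq) print buf }
  '
}
```

It would be used as a filter, for example `join_quoted_records < input.txt > joined.txt`, where both file names are placeholders. Because the test is quote parity rather than field count, a field with several embedded line breaks is handled as well.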

chulett · Post by **chulett** » Mon Mar 19, 2012 7:36 am

And what if they want to preserve it? Use the Server Sequential File stage in a Server Shared Container.

zulfi123786 · Post by **zulfi123786** » Mon Mar 19, 2012 9:58 pm

The Server Sequential File stage has a pretty good option for this, "Contains Terminators". Not sure why it was not provided in the parallel stage.