read column with new line characters

Posted: Mon Mar 19, 2012 3:19 am
by zulfi123786
The file has 10 columns, one of which contains newline characters. That column is enclosed in double quotes, but the record delimiter is also a Unix newline, and this causes the record to be skipped.

I was able to read the file when I changed the record delimiter from newline to another character and specified that the column is quoted, but the same approach does not work with newline as the record delimiter.

This can be done by reading with a Server Sequential File stage, but I can't go with that. I also need to preserve the \n characters.

Any ideas would be appreciated

Thanks

Posted: Mon Mar 19, 2012 7:28 am
by prasson_ibm
You can remove the newline using the following Unix command:

awk -F"delimiter" '{while (NF<10 && (getline line)>0) $0=$0 line} 1'

Posted: Mon Mar 19, 2012 7:36 am
by chulett
And what if they want to preserve it? Use the Server Sequential File stage in a Server Shared Container.
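
If a pre-processing step is acceptable, one way to keep the embedded newlines is to re-terminate each record with a different character before the parallel job reads the file, since the original post reports that a non-newline record delimiter plus quoting works. The following is only a sketch, not a tested DataStage recipe. The function name rejoin_records and the terminator argument are hypothetical, and it assumes that every double quote in the data either opens or closes a field (or is doubled inside one) and that the chosen terminator never occurs in the data.

```shell
# Rewrite a file so that each logical record ends with a custom terminator
# instead of a newline. Newlines inside double-quoted fields are kept as-is.
# Usage: rejoin_records INPUT_FILE TERMINATOR
rejoin_records() {
  awk -v term="$2" '
    {
      # Append this physical line to the record being assembled.
      rec = (pending ? rec "\n" $0 : $0)
      # Count the double quotes on this line (gsub returns the match count).
      tmp = $0
      quotes += gsub(/"/, "", tmp)
      if (quotes % 2 == 0) {
        # Quotes are balanced, so the logical record is complete.
        printf "%s%s", rec, term
        pending = 0
        quotes = 0
      } else {
        # Still inside a quoted field: keep reading physical lines.
        pending = 1
      }
    }
    # Flush a trailing record whose quotes never balanced.
    END { if (pending) printf "%s%s", rec, term }
  ' "$1"
}
```

The output could then be read with the record delimiter set to the chosen terminator and the column marked as quoted, as described in the first post.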

Posted: Mon Mar 19, 2012 9:58 pm
by zulfi123786
The Server Sequential File stage has a pretty good option for this, "Contains Terminators". I'm not sure why it was not provided in the parallel stage :cry: