read column with new line characters

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore


Post by zulfi123786 »

The file has 10 columns, one of which contains newline characters. That column is enclosed in double quotes, but the record delimiter is also a UNIX newline character, and this causes the record to be skipped.

I was successful in reading the file when the record delimiter was changed from newline to another character and the column was specified as quoted, but the same approach does not work when newline is the record delimiter.

This can be done by reading the file with a Server Sequential File stage, but I can't go with that. I also need to preserve the \n characters.
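As a rough sketch of the workaround that did work (a different record delimiter), a small preprocessing step could rewrite the file before the parallel job reads it. The function name `change_record_delimiter` and the default delimiter `\x1e` are hypothetical choices for illustration; the sketch assumes every quote in the file is a double quote that opens or closes a field, with literal quotes doubled as `""`.

```python
def change_record_delimiter(text: str, new_delim: str = "\x1e") -> str:
    """Replace record-ending newlines with new_delim.

    Newlines inside double-quoted fields are left untouched.
    Assumption: quotes are balanced and literal quotes are doubled ("").
    """
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            # A doubled quote ("") toggles twice, so the state is unchanged.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "\n" and not in_quotes:
            # Newline outside quotes ends a record: swap in the new delimiter.
            out.append(new_delim)
        else:
            # Everything else, including quoted newlines, is copied as-is.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = '1,"line one\nline two",3\n4,"plain",6\n'
    print(repr(change_record_delimiter(sample, "|")))
```

The stage reading the rewritten file would then be configured with the chosen character as the record delimiter and the column marked as quoted, which is the combination already reported to work.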

Any ideas would be appreciated.

Thanks
- Zulfi
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

You can remove that newline with the UNIX command below:

Code:

awk -F"delimiter" 'NF<10{printf("%s",$0);getline;print;next}1'
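Note that the one-liner above joins a short line with only the single line that follows it, and the newline is dropped rather than kept; `"delimiter"` stands for the file's actual field delimiter. If the line break has to survive, one possible variation is to swap each embedded newline for a placeholder token that can be turned back into `\n` later in the job. The sketch below uses Python's `csv` module for quote-aware parsing; the function name `flatten_embedded_newlines` and the token `<NL>` are hypothetical, and a comma delimiter is assumed only as a default.

```python
import csv
import io


def flatten_embedded_newlines(text: str, delimiter: str = ",",
                              placeholder: str = "<NL>") -> str:
    """Return text with one physical line per record.

    Newlines inside quoted fields are replaced by `placeholder`.
    Assumption: the placeholder never occurs in the real data.
    """
    # newline="" leaves line endings alone so csv can see quoted newlines.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        # Handles any number of line breaks per field, not just one.
        writer.writerow([field.replace("\n", placeholder) for field in row])
    return out.getvalue()


if __name__ == "__main__":
    sample = '1,"line one\nline two",3\n4,plain,6\n'
    print(flatten_embedded_newlines(sample), end="")
```

The writer re-quotes a field only when it needs to, so quotes around plain values are not carried over to the output; that is a side effect of this sketch, not a requirement.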
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And what if they want to preserve it? Use the Server Sequential File stage in a Server Shared Container.
-craig

"You can never have too many knives" -- Logan Nine Fingers
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

The Server Sequential File stage has a pretty good option for this, "Contains Terminators". Not sure why it was not provided in the parallel stage :cry:
- Zulfi