How load Sequential File with data that had multiline values

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
MrBlack
Participant
Posts: 125
Joined: Wed Aug 08, 2012 8:57 am

How load Sequential File with data that had multiline values

Post by MrBlack »

I'm trying to figure out how to load a CSV that has multiline values within the data columns. Example, here's 3 rows of sample data:

Code: Select all

"Col1","Col2","Col3"
"first row","really long text","more data"[CF/LF]
"second row","example of breaks in the data[CF/LF]
that even though I'm on a new line,[CF/LF]
i'm still the same column of data","more data"[CF/LF]
"third row","really long text","final row"[CF/LF]
The line terminators are dos format [CF/LF] both at the end of the file line and within the multiline field. Quotes and commas between columns. I've tried so many settings in the Format tab section of the stage that my mind is swimming trying to think what various combinations that I have or haven't tried yet.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use the Server Job Sequential File stage, possibly within a server Shared Container. This stage type has the ability to handle internal line terminators (in quoted fields) out of the box.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ShaneMuir
Premium Member
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

What are you using to open you csv file?

I don't think the issue with with you writing your output, but rather with the program reading the file.

If you want to combine the data over multiple lines into a single column, then you need to find a way to do that prior to your output. If you require those fields to be as a multi row column in your output then you would need to put in dummy column values at each line break.

Either way you need to identify the line breaks in your data and replace or remove them accordingly prior to your output.
MrBlack
Participant
Posts: 125
Joined: Wed Aug 08, 2012 8:57 am

Post by MrBlack »

Ray's answer is as I feared. I used to only be a on a server edition so I wrote a linux command to sFTP the csv and then used a server sequential file job to load the data. I now have a enterprise license available so i was hoping to take advantage of the parallel FTP stage to do native sFTP but it's choking on this multi-line data. I guess I'll just stick with my old server job that is working but just not as modern.
Post Reply