Remove embedded line feed in CDC



Post by ccatania »

DataStage 8.5
CDC 6.5
O/S Linux

I am using CDC to DataStage, which creates a sequential file. The file contains a 4000-character field that has embedded line feeds (LF) and blank lines. The fields in the sequential file are enclosed in double quotes and comma-delimited; the source for CDC is Oracle.

The problem is that the process aborts in the Sequential File stage due to a format error. The embedded line feed causes the record to wrap, so the first field of the next line, which should be a timestamp, contains text instead. This is an example of the data in the field:
TOTAL PRODUCT PURCHASES
($PER CALENDAR YEAR)",,,"2008-03-01 00:00:00","2011-02-28 00:00:00","inactive","0","1",,,,"1",,"2009-02-06 10:41:16","Integration","2011-03-01 02:00:08","InactivateBatch",,,,
"2011-07-28 19:45:50","0","I","NOT SET ",,,,,,,,,,,,,,,,,,,,,,,,,,"24055","7985","25499","4721","TIER 2 $500,000 < $750,000

You can see that this one field has several line feeds embedded in it.

Now to my question: is there a way in CDC to handle the line feeds, that is, to suppress them? I know there may be other ways, such as UNIX or Perl scripts, but I was looking to see whether CDC can handle this. If anyone has information on how to remove the embedded LF, I would really like to see how you were able to get it done in CDC.
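
For reference, this is roughly what the script-based fallback mentioned above might look like. It is only a sketch in Python and not a CDC feature. The function names `strip_embedded_newlines` and `clean_file` are made up for illustration. It assumes that any double quote inside a field is escaped by doubling it (or that none occur), and that replacing each embedded LF with a single space is acceptable.

```python
def strip_embedded_newlines(text: str, replacement: str = " ") -> str:
    """Replace line feeds that occur inside double-quoted fields.

    Assumption: quotes inside a field are doubled ("") or absent, so
    toggling the state on every quote character keeps it correct.
    """
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == "\n":
            # Embedded LF: substitute so the record stays on one line.
            out.append(replacement)
        elif in_quotes and ch == "\r":
            # Drop an embedded CR so CRLF collapses to one replacement.
            continue
        else:
            # Everything else, including record-ending newlines, is kept.
            out.append(ch)
    return "".join(out)


def clean_file(src_path: str, dst_path: str) -> None:
    """Read src_path, strip embedded newlines, write to dst_path."""
    # newline="" stops Python from translating line endings.
    with open(src_path, newline="") as src:
        cleaned = strip_embedded_newlines(src.read())
    with open(dst_path, "w", newline="") as dst:
        dst.write(cleaned)
```

This reads the whole file into memory, so a very large extract would need a streaming variant that carries the in-quotes state from one chunk to the next. It would have to run on the file before the Sequential File stage reads it.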

Thanks