Removing newline characters while reading from a sequential file

AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

Removing newline characters while reading from a sequential file

Post by AmeyJoshi14 »

Hi Experts,

We are reading a .csv file that has embedded newline characters in multiple columns. I have tried different options to read these records, but the records are still getting rejected.
Below are the options I tried, but no luck.
In the Format tab:

Under Record level:
1. Changed the final delimiter to '\n', end, none, null
2. Added the record delimiter string = DOS format

Under Field defaults:
1. Added the delimiter string chr(10), char(13)
2. Removed the double quotes and added them back


Our data file is:

Code:

"ID" , "Comments", "Modified_date","Modified by"
"1", " THIS iS 
BIG COMMENT
AND BIG VALUE","2013-09-20T05:00:00Z" ," TESTER"                            --------> record 1
"2",

"<div><font face="Calibri" size="2" color="black">#nllsd	some big value
and 
more value

","2015-09-20T05:00:00Z" ," TESTER1"                       		    --------> record 2
Please let me know which options I should set to read the newline characters in this file.

Appreciate all your help!!

Thanks in advance
http://findingjobsindatastage.blogspot.com/
Theory is when you know all and nothing works. Practice is when all works and nobody knows why. In this case we have put together theory and practice: nothing works and nobody knows why! (Albert Einstein)
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

The sequential file stage in parallel jobs has no tolerance for record delimiters embedded in the data. In a parallel job, you could read the individual "short" records and piece them together until you arrive at a "whole" record. This of course means defining the input as a single unbounded varchar field.
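Outside of DataStage, that reassembly logic is easy to sketch. Below is a minimal Python illustration of one way to do it, assuming a logical record is complete once its double quotes balance (an odd running quote count means the record continues on the next physical line). The file name input.csv and the function reassemble are placeholders of mine, not DataStage constructs.

Code:

# Sketch of piecing "short" records together into "whole" records.
# Assumption: a complete record contains an even number of double quotes,
# so an unbalanced quote means the record continues on the next line.

def reassemble(lines):
    """Yield logical records by joining lines until quotes balance."""
    buffer, quotes = [], 0
    for line in lines:
        buffer.append(line.rstrip("\r\n"))
        quotes += line.count('"')
        if quotes % 2 == 0:          # quotes balanced: record is complete
            yield " ".join(buffer)   # embedded newlines become spaces
            buffer, quotes = [], 0
    if buffer:                       # emit any trailing partial record
        yield " ".join(buffer)

with open("input.csv", encoding="utf-8") as f:
    for record in reassemble(f):
        print(record)

Note that a parity check like this only works when quotes inside a field are escaped; the second record in the sample above contains unescaped quotes (face="Calibri"), which would have to be doubled or stripped before any such logic can tell where the record ends.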

The sequential file stage in a server job is more tolerant. You could consider a server shared container for reading your input file.

Mike
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

You may also consider a Unix-based solution to fix the file before processing it in DataStage; a sketch of that pre-processing idea follows the links below.

See these links:
http://www.unix.com/shell-programming-s ... -file.html
http://stackoverflow.com/questions/1483 ... ix-solaris
http://www.unix.com/hp-ux/147184-how-re ... -file.html
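If a script is acceptable for that pre-processing step, here is a minimal sketch in Python rather than shell, since the idea is the same as in the links above: the standard csv module already understands quoted fields with embedded newlines, so the file can simply be read and rewritten with those newlines flattened. The file names input.csv and fixed.csv are placeholders, and this assumes the file is valid CSV apart from the embedded newlines (unescaped quotes inside a field, as in record 2 of the sample, would still need to be fixed first).

Code:

# Rewrite the CSV so DataStage never sees embedded record delimiters.
# The csv module parses quoted fields that span physical lines.
import csv

with open("input.csv", newline="", encoding="utf-8") as src, \
     open("fixed.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.reader(src, skipinitialspace=True)
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
    for row in reader:
        # Replace carriage returns and line feeds inside each field.
        writer.writerow([f.replace("\r", " ").replace("\n", " ") for f in row])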
AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

Post by AmeyJoshi14 »

Thanks for the help.

I am using a server job to load this file, but it is giving an error for some of the records that contain large XML content. Below is the error message:

Code:

Sequential_File_0.DSLink2: nls_map_buffer_in() - NLS mapping error, row 96338 (approx), row ="<div><font face="Calibri" size="2" color="black"> THIS is huge
comment </font></div>"
Values for the above records are inserted as null.
http://findingjobsindatastage.blogspot.com/
Theory is when you know all and nothing works. Practice is when all works and nobody knows why. In this case we have put together theory and practice: nothing works and nobody knows why! (Albert Einstein)