Newline character between double quotes in sequential file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

sahityab
Participant
Posts: 14
Joined: Fri Jun 24, 2011 7:51 am

Newline character between double quotes in sequential file

Post by sahityab »

Hi,
I am having trouble handling newline characters between double quotes in a sequential file. The sequential file has the following settings:
RECORD DELIM = UNIX NEW LINE
FIELD DELIM = ~
QUOTES = double
NULL = *

Here is an example of a record I am trying to handle:

2011~~~"'eye infection' written tresaderm131.37
"~"3449 E Pacific Coast Hwy

My understanding is that as long as the field is varchar and the value is enclosed in double quotes, DataStage will not try to interpret the characters between the quotes. However, DataStage is rejecting these records. Can someone please help me find a solution?
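For comparison, here is a minimal Python sketch of the behavior expected above: a quote-aware parser treats a newline inside double quotes as field content, not as a record delimiter. It uses Python's standard `csv` module, not DataStage, so it says nothing about how the Sequential File stage itself behaves. The sample is modeled on the record above; the closing quote on the last field and the function name `parse_records` are assumptions added for illustration.

```python
import csv
import io

# Modeled on the record in the post. The post shows the record truncated,
# so the closing quote on the final field is an assumption.
SAMPLE = (
    "2011~~~\"'eye infection' written tresaderm131.37\n"
    "\"~\"3449 E Pacific Coast Hwy\"\n"
)


def parse_records(text):
    """Parse '~'-delimited text; double quotes protect embedded newlines."""
    # newline="" avoids newline translation so embedded line breaks survive.
    stream = io.StringIO(text, newline="")
    reader = csv.reader(stream, delimiter="~", quotechar='"')
    return list(reader)


rows = parse_records(SAMPLE)
print(len(rows), "record(s),", len(rows[0]), "fields")
```

The two physical lines come back as a single record, with the line break preserved inside the fourth field.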
Thanks in advance,
S
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Parallel Sequential File stage seems to have trouble with these. Prefer a server Sequential File stage, perhaps in a server Shared Container in your parallel job.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

It sounds like the newline takes precedence over the double quote. I had a similar situation in a previous life and used transformer stage variables to concatenate rows like this together. The trick with that option is to be able to detect the situation correctly.
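The detection logic can be sketched outside DataStage. The Python below is not Transformer stage-variable code; it only illustrates one possible rule: keep joining physical lines while the running count of double quotes is odd. It assumes double quotes appear only as field enclosures (a doubled `""` keeps the count even, so it is harmless). The name `merge_split_records` and the `joiner` parameter are hypothetical.

```python
def merge_split_records(lines, quote='"', joiner=""):
    """Join physical lines until the quote count is even, yielding logical records."""
    pending = []       # pieces of the record currently being assembled
    quote_count = 0    # quotes seen so far in the pending record
    for line in lines:
        pending.append(line.rstrip("\r\n"))
        quote_count += line.count(quote)
        if quote_count % 2 == 0:
            # Quotes are balanced, so the record is complete.
            yield joiner.join(pending)
            pending = []
            quote_count = 0
    if pending:
        # Unbalanced tail: emit it rather than silently dropping data.
        yield joiner.join(pending)


example = ['a~"x\n', 'y"~b\n', 'c~d\n']
merged = list(merge_split_records(example))
print(merged)
```

A record with an odd number of quotes on its first line is treated as continuing onto the next line. A stray unmatched quote in the data would defeat this rule, which is the detection risk mentioned above.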
Choose a job you love, and you will never have to work a day in your life. - Confucius
dspxlearn
Premium Member
Posts: 291
Joined: Sat Sep 10, 2005 1:26 am

Post by dspxlearn »

Maybe a shell script could remove the newline characters that appear before the closing double quotes in the file. The script could then be called from the before-job subroutine so that the cleaned file is ready as input for the job.
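As a rough sketch of such a cleanup step, written here in Python rather than shell, the function below walks the text one character at a time, tracks whether it is inside double quotes, and drops any line break found there. It assumes quotes are used only to enclose fields; the names `strip_quoted_newlines` and `clean_file` are hypothetical, and `clean_file` reads the whole file into memory, which may not suit very large inputs.

```python
def strip_quoted_newlines(text, quote='"', replacement=""):
    """Remove (or replace) line breaks that occur inside double-quoted fields."""
    out = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            in_quotes = not in_quotes   # toggle on every quote character
            out.append(ch)
        elif in_quotes and ch in "\r\n":
            out.append(replacement)     # line break inside a quoted field
        else:
            out.append(ch)              # everything else passes through
    return "".join(out)


def clean_file(src_path, dst_path):
    """Write a cleaned copy of src_path to dst_path (whole file held in memory)."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        dst.write(strip_quoted_newlines(src.read()))


cleaned = strip_quoted_newlines('a~"x\ny"~b\nc~d\n')
print(repr(cleaned))
```

Record-ending newlines outside quotes are left untouched, so the cleaned file keeps one record per line.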
Thanks and Regards!!
dspxlearn
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Why do people think they can change the client's data with impunity?!!

What if these linefeeds are meaningful to the client?
sahityab
Participant
Posts: 14
Joined: Fri Jun 24, 2011 7:51 am

Post by sahityab »

Thank you for all your responses.
In my case we do not want the newline characters to be there; it was a mistake on the source side that we are getting these characters. Any suggestions on how to remove the newline characters are greatly appreciated.
I really can't use a Transformer to do this, as the records are getting rejected at the Sequential File stage itself.
I am also concerned about the performance of the job if I use a server job to do this. Please let me know if you have any ideas.
Thanks,
Sahi
sahityab
Participant
Posts: 14
Joined: Fri Jun 24, 2011 7:51 am

Post by sahityab »

Hi Ray,
Can you please explain what you mean by using a server Sequential File stage or a server Shared Container? Do you mean to recommend creating a server job for this? Please explain.
Thanks,
Sahi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes, I recommend using a server Sequential File stage.

Depending on the volume of data, use a server job (small to medium volume, say up to 10 million rows) or a server Shared Container containing just a Sequential File stage in a parallel job if the volume of data is large.
sahityab
Participant
Posts: 14
Joined: Fri Jun 24, 2011 7:51 am

Post by sahityab »

Hi Ray,
Thank you for the response. I have not used shared containers before. Can you please explain how to create a shared container with just a Sequential File stage in it? I believe there should also be an output connection for the link from the Sequential File stage. Also, I have about 10 files, each with its own format, in which these newline characters occur. Can I use one shared container for all of them, or should I create one for each?
Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Here's what I have in mind.

SharedContainer
+-----------------------------------+
|                                   |
|   SeqFile                Output   |
|   +-----+                +----+   |
|   |     |  ----------->  |    | --------->
|   +-----+                +----+   |
|                                   |
+-----------------------------------+
sahityab
Participant
Posts: 14
Joined: Fri Jun 24, 2011 7:51 am

Post by sahityab »

Thank you all for your responses. I was able to resolve the issue by removing the newline characters using macros in Excel.