Delimiter Issue in extracting file

nveejas
Participant
Posts: 11
Joined: Sun Sep 26, 2010 11:42 pm
Location: illinois

Post by nveejas »

Hi All,
We have a scenario as below:

Source file:

"1","AAA","BBB","CCC"
"2","AA",A","BBB","CCC"

In the records above, the 1st record is processed successfully if we set " and , as the delimiters in the Sequential File stage. The second record, however, has both a double quote (") and a comma (,) inside the data of the 2nd column (AA",A), so the record is dropped. Is there any way to read this kind of record in a DS job? I have tried using either " or , alone as the delimiter, but in both cases the data is truncated or dropped. I would appreciate your help in solving this.
Thanks,
Sajeev N
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

What happens if you switch from the delimiter property to delimiter string, pass "," (quote, comma, quote) as the delimiter string, and set quote=double?
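To see why a multi-character delimiter string could help here, the following is a minimal Python sketch of the idea outside DataStage. It is not a description of how the Sequential File stage parses records; the function name `split_on_delimiter_string` is made up for illustration, and the sample lines are the two records from the original post.

```python
def split_on_delimiter_string(line, delim='","', quote='"'):
    """Split one record on a multi-character delimiter string.

    Hypothetical helper for illustration only; it does not model
    DataStage behaviour.
    """
    body = line.rstrip("\r\n")
    # Drop the quote that opens the first field and the one that closes the last.
    if len(body) >= 2 and body.startswith(quote) and body.endswith(quote):
        body = body[1:-1]
    # Only the full three-character sequence "," counts as a field boundary,
    # so a lone quote or a lone comma inside the data is left alone.
    return body.split(delim)


records = [
    '"1","AAA","BBB","CCC"',
    '"2","AA",A","BBB","CCC"',
]

for rec in records:
    print(split_on_delimiter_string(rec))
```

Both sample records come out as four fields, with the second field of record 2 kept as `AA",A`. The approach still breaks if the data itself ever contains the exact sequence `","`.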
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You'd stand a better chance of reading this in a Server job or, failing that, in a Server Shared Container in your PX job with the Server Sequential File stage in it. The Server stage is more forgiving of issues of that nature.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Line 2 is badly-formed. It has an odd number of quote characters. Demand well-formed data from your provider.
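The odd-quote-count observation is easy to check mechanically. A small Python sketch, assuming one record per line and that a well-formed line always carries its quotes in pairs; `has_unbalanced_quotes` is a hypothetical name, not part of any DataStage tooling:

```python
def has_unbalanced_quotes(line, quote='"'):
    """Return True when the line holds an odd number of quote characters."""
    return line.count(quote) % 2 == 1


records = [
    '"1","AAA","BBB","CCC"',
    '"2","AA",A","BBB","CCC"',
]

for number, rec in enumerate(records, start=1):
    status = "badly formed" if has_unbalanced_quotes(rec) else "ok"
    print(number, rec.count('"'), status)
```

A check like this only catches an odd count; a line with two stray quotes would pass it.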
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mobashshar
Participant
Posts: 91
Joined: Wed Apr 20, 2005 7:59 pm
Location: U.S.

Post by mobashshar »

One way is to remove all the double quotes in a before-job subroutine and then use , as the delimiter.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, the comma is part of the data in the second field. Doing that would make the second record parse out as five fields instead of four, I'm afraid.
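The field-count problem can be reproduced in a few lines of Python. This is only a sketch of the strip-then-split idea applied to the two sample records, not the actual before-job subroutine; `strip_quotes_then_split` is a hypothetical name.

```python
def strip_quotes_then_split(line, delim=",", quote='"'):
    """Remove every quote character, then split on the plain delimiter."""
    return line.rstrip("\r\n").replace(quote, "").split(delim)


records = [
    '"1","AAA","BBB","CCC"',
    '"2","AA",A","BBB","CCC"',
]

for rec in records:
    fields = strip_quotes_then_split(rec)
    # The comma inside the second field of record 2 becomes a field boundary.
    print(len(fields), fields)
```

Record 1 still yields four fields, while record 2 yields five because the embedded comma is no longer protected.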
-craig
nveejas
Participant
Posts: 11
Joined: Sun Sep 26, 2010 11:42 pm
Location: illinois

@Ray

Post by nveejas »

Ray,

The file is generated by CDC whenever there is an update or insert in the source DB2 table. By default, CDC creates files with comma (,) and double quotes (") as delimiters. Is there any way to change the delimiter in CDC?
Thanks,
Sajeev N
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Gotta be. Research it and let us know.
wwilliamson
Participant
Posts: 21
Joined: Fri Oct 01, 2010 2:45 pm

We used a preprocess shell script to strip unwanted newlines

Post by wwilliamson »

Some of our data was apparently coming from web forms with free-form text fields that weren't being sanitized. A shell script was used to parse the file and locate extraneous newlines beforehand. The same approach could be used in this case to locate errant quote characters.
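The script itself is not shown in this thread, so the following is only a rough Python sketch of what a preprocessing pass for errant quotes might look like. It assumes every field is quoted and that fields are separated by exactly `","`; the names `find_errant_quotes` and `scan` are hypothetical.

```python
def find_errant_quotes(line, quote='"', delim=","):
    """Return the 0-based positions of quotes that are not field boundaries.

    A quote is treated as structural when it is the first or last character
    of the record, or when it is part of the sequence quote+delim+quote.
    """
    body = line.rstrip("\r\n")
    boundary = quote + delim + quote
    width = len(boundary)
    errant = []
    for i, ch in enumerate(body):
        if ch != quote:
            continue
        if i == 0 or i == len(body) - 1:
            continue
        # Quote that closes a field: it starts a boundary sequence.
        closes_field = body[i:i + width] == boundary
        # Quote that opens a field: it ends a boundary sequence.
        start = i - width + 1
        opens_field = start >= 0 and body[start:i + 1] == boundary
        if not (closes_field or opens_field):
            errant.append(i)
    return errant


def scan(lines):
    """Map 1-based line numbers to the errant quote positions found there."""
    report = {}
    for number, line in enumerate(lines, start=1):
        positions = find_errant_quotes(line)
        if positions:
            report[number] = positions
    return report


records = [
    '"1","AAA","BBB","CCC"',
    '"2","AA",A","BBB","CCC"',
]
print(scan(records))
```

For the two sample records this reports only line 2, pointing at the quote that follows `AA`. What to do with a flagged line (escape, drop, or reject the file) is a separate decision.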
nveejas
Participant
Posts: 11
Joined: Sun Sep 26, 2010 11:42 pm
Location: illinois

@ mobashshar

Post by nveejas »

Thanks mobashshar for the information.

We have fixed this issue by changing the delimiter to pipe (|) in CDC. We opened a PMR with IBM, and they provided a Java program that changes the delimiter to pipe when CDC generates the files. Now it's working fine.

Thanks for all your help!
Thanks,
Sajeev N