Page 1 of 1

Quote character within a quoted text field

Posted: Fri Oct 23, 2009 6:30 am
by dohertys
Hi ,
I've got a comma separated file with character fields quoted with "s.

I've found some records where there is a section within a character field that is quoted with 2 double quotes.

e.g.
1,"some field1","some field with a ""quoted section"" in it",2

This seems to cause my datastage to assume that its a new field when it sees the "" characters.

Anyone got any ideas how I should handle that ?

Why does datastage treat the " character as though it signifies a new field ( when the file is using comma as the delimiter).?

Posted: Fri Oct 23, 2009 7:08 am
by chulett
Because it does. [shrug]

Have you tried using a Server job for that, they are more... forgiving.

Posted: Fri Oct 23, 2009 7:35 am
by dohertys
A good plan, but I've got to use our 'standard' jobs. (We've got a generic bit of px code to take any file and load it into a table with the same format).


I think I might stick a unix script with a sed command to remove "" from the file...unless anyone's got any other ideas.

Is there anything that can be set in the orchestrate schema definition for the input file layout which would handle these double quotes?

Posted: Fri Oct 23, 2009 7:55 am
by betterthanever
If have control over the source files you were reading, i would suggest to get them tab delimited so that you will not have these issues with quotes...

Posted: Fri Oct 23, 2009 7:59 am
by chulett
That won't help, it's not the field delimiter that's the issue, it's the fact that string fields are double-quoted and then have them inside as well. You could try removing that property from the field and then trim the quotes from inside the job, I would think.

Posted: Fri Oct 23, 2009 8:02 am
by betterthanever
i agree..presumed problem with a ',' and replied for a "" ..my bad.. avoiding quotes on source side will help..

Posted: Fri Oct 23, 2009 10:57 am
by ray.wurlod
In which case set the Quote character in the format to none.