
Rogue quotation mark appearing mid-file

Posted: Tue May 20, 2008 6:26 am
by jackdaw
The DS job runs successfully with no warnings in the log, but when I view one of the output (csv) files in DS I get: mrg_all_chips_set_1..noDupes_tmds_spi_file.olk_tmdsspi_noDupes_pif_file: read_delimited() - invalid quotes, row 14874 column rfaind = "N"".

Sure enough, the doubled quote is there in the output csv. But it isn't present in the input csv, nor in another output file with the same rows plus some duplicates, which has the same file format and line-end characters.
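
For context, the complaint is the standard CSV quoting rule: inside a quoted field a literal double quote has to be doubled and the field still needs its closing quote, so "N"" is a field that never ends. A rough illustration of how a strict CSV reader reacts, using Python's csv module purely as a stand-in for the DataStage read_delimited() routine:

Code:

import csv

good = '"N"'    # what every other row has: a quoted N
bad = '"N""'    # what row 14874 has: the "" reads as an escaped quote,
                # so the field never gets its closing quote

print(list(csv.reader([good])))            # [['N']]
print(list(csv.reader([bad])))             # lenient parse gives [['N"']]

try:
    list(csv.reader([bad], strict=True))   # a strict reader rejects it,
except csv.Error as err:                   # much like read_delimited() does
    print("invalid quotes:", err)          # unexpected end of data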

What's going on ?? :?

Posted: Tue May 20, 2008 6:32 am
by chulett
Hmmm... that's next to impossible to say without eyes on the target. Or more details. :?

What's different about this link from the other output that doesn't have the issue? What kind of transformations, if any, are you doing to this field? Do other records populate rfaind with a solitary 'N' without issue? It might also help to describe your job design.

Posted: Tue May 20, 2008 6:33 am
by ArndW
What does the input row 14874 look like - does it have doubled double quotes correctly represented? Do you specify the double-quote as the quote character for both input and output?

Posted: Tue May 20, 2008 6:53 am
by jackdaw
In Textpad (it's a csv file) it shows as:

"N""

when it should be

"N"

All other values output for this column are "N".

The double-quote character is specified as the quote character on both the input file and the output files.

They all have DOS-style line termination (and the problem column is the last one on the line), but changing them to UNIX style makes no difference.

I can't view it in DS because of the error.
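
If DS won't open it, one way to see exactly what's on that row is to pull it straight out of the file. A quick sketch, with output.csv standing in for the real file name, and assuming row 14874 in the error corresponds to physical line 14874 (i.e. no header rows):

Code:

# Print line 14874 of the output file exactly as it sits on disk,
# quotes and line terminator included (repr shows \r\n vs \n too).
# "output.csv" is a placeholder for the real output file name.
with open("output.csv", "rb") as f:
    for line_number, line in enumerate(f, start=1):
        if line_number == 14874:
            print(repr(line))
            break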

Any thoughts ?
ArndW wrote: What does the input row 14874 look like - does it have doubled double quotes correctly represented? Do you specify the double-quote as the quote character for both input and output?

Posted: Tue May 20, 2008 7:09 am
by jackdaw
Thanks.

The difference is in the constraint - one has duplicates and the other doesn't.

The duplicates are identified by using stage variables to compare each row's record key with that of the previous row, setting a flag to "NoDupe" if the keys differ or "Dupe" if they match. The rows flagged "NoDupe" are written to the file which has the error (B).

The other file (A, the successful one) is constrained differently so that it gets all rows, including the duplicates.

The constraints are:

A:

Code:

upcase(slk_trf_final_rules.pif) <> "PIF"


B:

Code:

svNewPif="NoDupe" and  upcase(slk_trf_final_rules.pif) <> "PIF"
Why would it occur on one row only ?

Puzzled.
chulett wrote: Hmmm... that's next to impossible to say without eyes on the target. Or more details. :?

What's different about this link from the other output that doesn't have the issue? What kind of transformations, if any, are you doing to this field? Do other records populate rfaind with a solitary 'N' without issue? It might also help to describe your job design.

Posted: Tue May 20, 2008 7:14 am
by ArndW
Is [quote]"N""[quote] value represented that way in your source or the target. If the former, then you don't have a well-formed input file and cannot process it correctly in CSV varying length format. If in the output, then you have exposed a bug in the sequential file write stage. In either case, the representation should be [quote]"N"""[quote]

Posted: Tue May 20, 2008 7:32 am
by jackdaw
Thanks

The source is "N", and the target is "N"", for one row, mid file.

Bizarre ?! What next ?
ArndW wrote: Is the "N"" value represented that way in your source or in the target? If the former, then you don't have a well-formed input file and cannot process it correctly in CSV varying-length format. If it is in the output, then you have exposed a bug in the sequential file write stage. In either case, the representation should be "N""".