Rogue quotation mark appearing mid-file

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

jackdaw
Premium Member
Posts: 16
Joined: Fri Jul 06, 2007 5:32 am

Post by jackdaw »

The DS job runs successfully, with no warnings in the log, but when I view one of the output (csv) files in DS I get this error:

mrg_all_chips_set_1..noDupes_tmds_spi_file.olk_tmdsspi_noDupes_pif_file: read_delimited() - invalid quotes, row 14874 column rfaind = "N"".

Sure enough, the output csv does contain "N"" at that position. But it isn't present in the input csv, nor in another output that has the same rows plus some duplicates and uses the same file format and line-end characters.

What's going on ?? :?
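
For anyone wanting to pinpoint rows like this outside DataStage, here is a minimal Python sketch. It walks each line under the usual CSV quoting rules (a doubled quote inside a quoted field counts as one literal quote) and reports the row and column of the first field that breaks them. The function names, the comma delimiter and the sample rows are assumptions for illustration; this is not the read_delimited() logic itself, and it does not handle line breaks embedded in quoted fields.

```python
def invalid_quote_column(line, delimiter=",", quote='"'):
    """Return the 1-based column of the first badly quoted field, or None."""
    column = 1
    i = 0
    n = len(line)
    while i <= n:
        if i < n and line[i] == quote:
            # Quoted field: scan forward to its closing quote.
            i += 1
            closed = False
            while i < n:
                if line[i] == quote:
                    if i + 1 < n and line[i + 1] == quote:
                        i += 2  # doubled quote = one literal quote
                        continue
                    closed = True
                    i += 1
                    break
                i += 1
            if not closed:
                return column  # ran off the end while still inside quotes
            if i < n and line[i] != delimiter:
                return column  # junk between closing quote and delimiter
        else:
            # Unquoted field: a bare quote inside it is invalid.
            while i < n and line[i] != delimiter:
                if line[i] == quote:
                    return column
                i += 1
        if i >= n:
            return None  # reached end of line with every field valid
        i += 1  # step over the delimiter
        column += 1
    return None


def find_bad_rows(lines, delimiter=","):
    """Return (row_number, column_number) pairs for badly quoted rows."""
    bad = []
    for row_number, line in enumerate(lines, start=1):
        column = invalid_quote_column(line.rstrip("\r\n"), delimiter)
        if column is not None:
            bad.append((row_number, column))
    return bad


# Hypothetical sample: only the second row has the stray trailing quote.
sample = ['"a","N"\r\n', '"b","N""\r\n', '"c","N"""\r\n']
print(find_bad_rows(sample))  # [(2, 2)]
```

Running the same check over the input file and both output files would show whether the stray quote exists only in the one target.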
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Hmmm... that's near to impossible to say without eyes on the target. Or more details. :?

What's different about this link from the other output that doesn't have the issue? What kind of transformations are you doing to this field, any? Do other records populate rfaind with a solitary 'N' without issue? It might also help to describe your job design.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

What does the input row 14784 look like - does it have doubled double quotes correctly represented? Do you specify the double-quote as the quote character for both input and output?
jackdaw
Premium Member
Posts: 16
Joined: Fri Jul 06, 2007 5:32 am

Post by jackdaw »

It shows (in Textpad - it's a csv file) as:

"N""

when it should be

"N"

All other values output for this column are "N"

The double quote character is specified on the input file and output files.

They all have DOS-style line termination (the problem occurs in the terminal column), but changing them to UNIX style makes no difference.
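
One way to confirm that the line terminator is not involved is to look at the raw bytes of the affected line rather than at an editor's rendering of it. The following is a minimal Python sketch; the helper name and the sample bytes are assumptions, not anything taken from the job.

```python
def split_line_ending(raw: bytes):
    """Split one raw line into (content, name_of_line_ending)."""
    if raw.endswith(b"\r\n"):
        return raw[:-2], "CRLF"  # DOS style
    if raw.endswith(b"\n"):
        return raw[:-1], "LF"    # UNIX style
    if raw.endswith(b"\r"):
        return raw[:-1], "CR"
    return raw, "none"


# Hypothetical raw lines: the same content with DOS and UNIX endings.
dos_content, dos_ending = split_line_ending(b'"x","N""\r\n')
unix_content, unix_ending = split_line_ending(b'"x","N""\n')
print(dos_ending, unix_ending, dos_content == unix_content)
```

If the content before the terminator is identical under both styles, the extra quote is part of the field data and not a side effect of the line ending.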

I can't view it in DS because of the error.

Any thoughts ?
jackdaw
Premium Member
Posts: 16
Joined: Fri Jul 06, 2007 5:32 am

Post by jackdaw »

Thanks.

The difference is in the constraint - one has duplicates and the other doesn't.

The duplicates are identified by using stage variables to compare each row's record key with that of the previous row, setting a value to "NoDupe" if the key is different and "Dupe" if it is the same. The rows with "NoDupe" are written to the file which has the error (B).

The other (successful (A)) file is constrained differently to have all rows including duplicates.

The constraints are:

A:

Code:

upcase(slk_trf_final_rules.pif) <> "PIF"

B:

Code:

svNewPif="NoDupe" and upcase(slk_trf_final_rules.pif) <> "PIF"

Why would it occur on one row only?

Puzzled.
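
For readers following the job design, the previous-row comparison and the two constraints can be approximated in Python as below. This is a sketch under stated assumptions, not the actual job: the "key" field name and the sample rows are hypothetical, the input is assumed to arrive sorted by key, and the first row is assumed to count as "NoDupe".

```python
def route_rows(rows):
    """Route rows to outputs A and B the way the two constraints describe.

    rows: iterable of dicts with a 'key' (hypothetical name) and a 'pif'
    field, assumed to be sorted by key.
    """
    out_a, out_b = [], []
    previous_key = None
    first = True
    for row in rows:
        # Stage-variable equivalent: compare with the previous row's key.
        if first or row["key"] != previous_key:
            sv_new_pif = "NoDupe"
        else:
            sv_new_pif = "Dupe"
        first = False
        previous_key = row["key"]

        not_pif = row["pif"].upper() != "PIF"
        if not_pif:
            out_a.append(row)  # constraint A: all non-PIF rows, duplicates included
        if sv_new_pif == "NoDupe" and not_pif:
            out_b.append(row)  # constraint B: non-PIF rows without duplicates
    return out_a, out_b


# Hypothetical sample rows, sorted by key.
rows = [
    {"key": 1, "pif": "x"},
    {"key": 1, "pif": "x"},
    {"key": 2, "pif": "pif"},
    {"key": 3, "pif": "y"},
]
out_a, out_b = route_rows(rows)
print(len(out_a), [r["key"] for r in out_b])
```

Nothing in this routing touches the field values themselves, which is consistent with the puzzle: the constraints only choose which rows go to each link.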
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Is the "N"" value represented that way in your source or in the target? If the former, then you don't have a well-formed input file and cannot process it correctly in CSV varying length format. If in the output, then you have exposed a bug in the sequential file write stage. In either case, the representation should be "N""".
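
The doubling rule can be checked with Python's standard csv module, shown here as a minimal sketch. The helper names are assumptions, and Python's writer stands in for a generic CSV writer, not for the DataStage sequential file stage.

```python
import csv
import io


def encode_row(values):
    """Write one row with every field quoted and return it as text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue().rstrip("\n")


def decode_line(line):
    """Parse one CSV line back into its list of field values."""
    return next(csv.reader([line]))


print(encode_row(["N"]))      # "N"
print(encode_row(['N"']))     # "N"""
print(decode_line('"N"""'))   # ['N"']
```

A value of N followed by a literal quote is written with three closing quotes: two for the escaped quote and one to close the field. The two-quote form "N"" leaves the field unterminated, which is what the "invalid quotes" message is complaining about.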
jackdaw
Premium Member
Posts: 16
Joined: Fri Jul 06, 2007 5:32 am

Post by jackdaw »

Thanks

The source is "N", and the target is "N"", for one row, mid file.

Bizarre ?! What next ?