confounding problem with QUOTES in schema file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
hsahay
Premium Member
Premium Member
Posts: 175
Joined: Wed Mar 21, 2007 9:35 am

confounding problem with QUOTES in schema file

Post by hsahay »

Hi

I am at the end of my wits trying to figure this out. So Help me please.

I have a simple variable length text file that uses pipe (|) as field delimiter and double quotes(") as text qualifier.

Here is a sample record -

"DMX MUSIC"|"Ch 01 S"|"OVERTURE to DER FREISCHUTZ (1817-21)"|"KUHN/STAATSKAPELLE DRESDEN/WEBER"|"CARL MARIA VON WEBER: OUVERTUREN"|"cpo"|"10052 1985"|"cpo RECORDS"|"07/01/2012"|"00:00:00"|"cpo RECORDS"|""|"0018950"

I am using the below schema file to read the text file and my job design is simple -

seq.filestage >>>>>>> odbc_connector stage.


//EO DB DMX detail table load
//Creatd On : 12/2013
//Creatd By : EJF
record nullable
{intact, record_delim='\n', final_delim=end,delim='|', quote=none}
(ServiceName:string[max=20];
Channel:string[max=20];
Title:string[max=2000];
Artist:string[max=2000];
Album:string[max=2000];
JunkRecordLableCode:string[max=20];
SuppCatalogID:string[max=200];
JunkRecordLableName:string[max=2000];
JunkPerfSttDt:string[max=20];
Duration:string[max=20];
LableName:string[max=2000];
LableParent:string[max=2000];
PlaylistFrequency:string[max=200];)

Note that i am using QUOTE = NONE in the schema file.

If i do that it works and the records are parsed correctly however, as expected, the leading and trailing double quotes become part of the data that is loaded in the table. Which is not what i want. I obviously want the data loaded in the table to be free of those infernal double quotes.

So, i change the QUOTE property in the schema file to quotes=double

But now the sequential file stage starts rejecting the row and seems to be unable to parse it.

I replaced all double quotes in my input file with the caret (^) and changed the schema file to say quotes='^' - same result. Record is rejected.

So the question is what do i need to do, what gods i need to pray to, what animal i need to sacrifice in order for me to specify a QUOTE in the schema file ?
vishal
hsahay
Premium Member
Premium Member
Posts: 175
Joined: Wed Mar 21, 2007 9:35 am

Post by hsahay »

ugh ...

Removed INTACT from the schema file and problem solved ....
Marking the thread as resolved.

If anybody has any idea why that solved the problem let me know. The documentation on schema file is very sparse especially on variable length files. :-(
vishal
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You only use the INTACT keyword for partial schemas. Changing quotes to "double" is what fixed the problem in your subject line, removing the "intact" property is what let the fields be parsed properly. How did you end up with that option in the schema file?
-craig

"You can never have too many knives" -- Logan Nine Fingers
hsahay
Premium Member
Premium Member
Posts: 175
Joined: Wed Mar 21, 2007 9:35 am

Post by hsahay »

Chulette

quote=double and quote='^' both work now that i have removed INTACT.

With INTACT i was getting the following error message (see bold) -

^DMX MUSIC^|^Ch 01 S^|^OVERTURE to DER FREISCHUTZ (1817-21)^|^KUHN/STAATSKAPELLE DRESDEN/WEBER^|^CARL MARIA VON WEBER: OUVERTUREN^|^cpo^|^10052 1985^|^cpo RECORDS^|^07/01/2012^|^00:00:00^|^cpo RECORDS^|^^|^0018950^,Field "_r_" with 'delim=end' did not consume entire input, at offset: 11

I put intact there because we will not be using all the fields in the file (those prefixed with "Junk" in the schema file.

I was trying different options trying to understand how schema files work. I am still not quite clear about how the partial schemas are defined. Some examples will be nice but they are hard to come by.

But for now i am good with what i have working.
vishal
Post Reply