Sequential file "delimiter not seen" error

Posted: Wed Aug 31, 2011 8:16 am
by PhilHibbs
There are lots of hits in the search for "delimiter not seen", and pretty much all of them boil down to wrong delimiters, or char fields instead of varchar.

My fields are all varchar, they are all nullable, and all delimiters are present as they should be, but still I get this error.

<SFUnifiedData,0> Field "POLLING_DISTRICT_NB" delimiter not seen, offset 14

At first, I assumed that "offset 14" indicated that it was reading the column headings: the first column name is 14 characters long, so offset 14 would be the tab between the first and second column names. However, removing the first line and un-selecting "first line is column names" makes no difference. Offset 14 in the data record should be well past the POLLING_DISTRICT_NB column, which is the second column. The first value is one character long and the second is three characters, so the tab character at the end of POLLING_DISTRICT_NB should be at offset 5 in the record.
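
The offset arithmetic above can be checked with a few lines of Python. This is only a sketch: the sample strings are copied from the hex dump posted later in the thread and truncated to the first few fields, offsets are counted from 0, and `tab_offsets` is a helper name made up for this illustration.

```python
# Sample strings copied from the hex dump later in the thread,
# truncated to the first few fields (enough for this check).
header = "ELECTOR_SRC_CD\tPOLLING_DISTRCT_NB\tELECTOR_NB"
record = "1\t123\t1\tR"


def tab_offsets(line):
    """Return the 0-based offset of every tab character in line."""
    return [i for i, ch in enumerate(line) if ch == "\t"]


header_tabs = tab_offsets(header)
record_tabs = tab_offsets(record)

print(header_tabs[0])   # 14: the tab after the 14-character first column name
print(record_tabs[:2])  # [1, 5]: the tabs after the first and second values
```

So an offset of 14 lines up with the first tab of the heading line, while in the data record the second field already ends at offset 5.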

Where do I look next?

Posted: Wed Aug 31, 2011 8:22 am
by PhilHibbs
Here is my schema:
[Image: screenshot of the schema]
Here is a hex dump of my data:

Offset    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F  0123456789ABCDEF
--------  -----------------------------------------------  ----------------
00000000  45 4C 45 43 54 4F 52 5F 53 52 43 5F 43 44 09 50  ELECTOR_SRC_CD P
00000010  4F 4C 4C 49 4E 47 5F 44 49 53 54 52 43 54 5F 4E  OLLING_DISTRCT_N
00000020  42 09 45 4C 45 43 54 4F 52 5F 4E 42 09 43 48 47  B ELECTOR_NB CHG
00000030  5F 54 59 50 45 5F 43 44 09 43 4F 4D 49 4E 47 5F  _TYPE_CD COMING_
00000040  4F 46 5F 41 47 45 5F 44 54 09 45 4C 45 43 54 4F  OF_AGE_DT ELECTO
00000050  52 5F 54 49 54 4C 45 09 45 4C 45 43 54 4F 52 5F  R_TITLE ELECTOR_
00000060  46 4F 52 45 5F 4E 4D 09 45 4C 45 43 54 4F 52 5F  FORE_NM ELECTOR_
00000070  4D 49 44 5F 49 4E 49 54 09 45 4C 45 43 54 4F 52  MID_INIT ELECTOR
00000080  5F 53 55 52 5F 4E 4D 09 41 44 44 52 5F 4C 4E 5F  _SUR_NM ADDR_LN_
00000090  31 09 41 44 44 52 5F 4C 4E 5F 32 09 41 44 44 52  1 ADDR_LN_2 ADDR
000000A0  5F 4C 4E 5F 33 09 41 44 44 52 5F 4C 4E 5F 34 09  _LN_3 ADDR_LN_4
000000B0  41 44 44 52 5F 4C 4E 5F 35 09 41 44 44 52 5F 4C  ADDR_LN_5 ADDR_L
000000C0  4E 5F 36 09 50 4F 53 54 43 4F 44 45 09 46 52 41  N_6 POSTCODE FRA
000000D0  4E 43 48 49 53 45 5F 43 44 09 44 4F 42 09 4F 43  NCHISE_CD DOB OC
000000E0  43 5F 53 54 41 54 5F 4F 50 54 4F 55 54 5F 49 4E  C_STAT_OPTOUT_IN
000000F0  0A 31 09 31 32 33 09 31 09 52 09 30 30 30 31 2D   1 123 1 R 0001-
00000100  30 31 2D 30 31 09 09 4A 4F 48 4E 09 09 53 4D 49  01-01  JOHN  SMI
00000110  54 48 09 31 20 4C 4F 4E 44 4F 4E 20 52 4F 41 44  TH 1 LONDON ROAD
00000120  09 57 4F 52 43 45 53 54 45 52 09 09 09 09 09 57   WORCESTER     W
00000130  52 35 20 32 44 4A 09 41 42 43 09 09 4E 0A        R5 2DJ ABC  N

Posted: Wed Aug 31, 2011 9:00 am
by rameshrr3
What's the delimiter? A space character? Are individual VarChar fields enclosed in quotes, and is the quote character set in the sequential file properties?

I would usually read date columns as VarChar - if they come from a file.

Posted: Wed Aug 31, 2011 9:32 am
by PhilHibbs
rameshrr3 wrote: What's the delimiter? A space character? Are individual VarChar fields enclosed in quotes, and is the quote character set in the sequential file properties?

I would usually read date columns as VarChar - if they come from a file.

Data is tab delimited with no quotes. Redefining the Date fields as VarChar 10 makes no difference.

Posted: Wed Aug 31, 2011 10:13 am
by DSguru2B
Open the file with Excel. It accepts tab-delimited files. See if it opens the file properly. If it doesn't, that means the file is not properly delimited. Let's start there.
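
For anyone without Excel to hand, the same sanity check can be sketched in Python: count the tab-delimited fields on every line and confirm the counts agree. The header and record below are transcribed from the hex dump earlier in the thread; `field_counts` is a hypothetical helper written for this illustration, and it assumes Unix line ends and no quoting.

```python
# Header and data record transcribed from the hex dump earlier in the thread.
header_fields = [
    "ELECTOR_SRC_CD", "POLLING_DISTRCT_NB", "ELECTOR_NB", "CHG_TYPE_CD",
    "COMING_OF_AGE_DT", "ELECTOR_TITLE", "ELECTOR_FORE_NM", "ELECTOR_MID_INIT",
    "ELECTOR_SUR_NM", "ADDR_LN_1", "ADDR_LN_2", "ADDR_LN_3", "ADDR_LN_4",
    "ADDR_LN_5", "ADDR_LN_6", "POSTCODE", "FRANCHISE_CD", "DOB",
    "OCC_STAT_OPTOUT_IN",
]
record = (
    "1\t123\t1\tR\t0001-01-01\t\tJOHN\t\tSMITH\t1 LONDON ROAD\t"
    "WORCESTER\t\t\t\t\tWR5 2DJ\tABC\t\tN"
)
file_text = "\t".join(header_fields) + "\n" + record + "\n"


def field_counts(text, delimiter="\t"):
    """Return the number of delimited fields on each line of text."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final line end
    return [len(line.split(delimiter)) for line in lines]


counts = field_counts(file_text)
print(counts)                 # [19, 19]
print(len(set(counts)) == 1)  # True: every line has the same field count
```

Matching counts only show that the file is consistently delimited; they say nothing about whether the stage's column definitions agree with it.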

Posted: Wed Aug 31, 2011 10:25 am
by PhilHibbs
There's nothing wrong with the file. Trust me, I've checked it three or four different ways.

Posted: Wed Aug 31, 2011 10:59 am
by chulett
Just for grins, Phil, can you share for us the settings you are using in the Sequential File stage? You've mentioned how your data is delimited and what the file looks like, but not the actual settings you are using... I just don't want to assume anything.

Posted: Wed Aug 31, 2011 11:30 am
by PhilHibbs
[Two images: screenshots of the Sequential File stage settings]

Posted: Wed Aug 31, 2011 12:09 pm
by PhilHibbs
One of my colleagues suggested that I should just throw away my metadata and re-load it from the file with an Import Sequential File, and that has worked, although it has made all my columns 255 characters long. I am working through them, setting everything back to what it was, and I'll post if I find what it was that broke it.

Posted: Wed Aug 31, 2011 2:15 pm
by sandeepgs
Hi Phil,

Just a suggestion: since the file was read successfully when the metadata was imported from the file, that shows that everything was imported as VARCHAR and no other data types.

So take the exact format that gave you the error, change just the date field to VARCHAR, and test. That might work.

If that works, the next option would be to change it back to Date and define the format of the dates that you expect to read. That should work.

Hope these suggestions help.

Thanks,
Sandeepgs

Posted: Wed Aug 31, 2011 6:42 pm
by ray.wurlod
I trust that you have configured the stage to ignore the first line as containing column headings?

Posted: Wed Aug 31, 2011 9:21 pm
by chulett
I trust you actually read what Phil posted? :wink:

Posted: Wed Aug 31, 2011 9:26 pm
by ray.wurlod
Too quickly, it seems.
:oops:

The original error did seem to suggest that it was trying to read the column heading, since the first tab appears at offset 15.

In the first data record, the first tab appears at offset 2.

Did the error message actually mention a row number?

Posted: Thu Sep 01, 2011 2:28 am
by PhilHibbs
ray.wurlod wrote: Did the error message actually mention a row number?

No, and there's only one row of data, terminated with a Unix line end.

Posted: Thu Nov 15, 2012 6:51 am
by saiwelcomes
I suspect that the order of the columns in the Sequential File stage does not match the order of the columns in the file. Please check.