Leading Question Mark Appears on First row of data

Posted: Mon Oct 19, 2015 7:13 pm
by Kel
Hi,

I have a file which is encoded in UTF-8 and has EOL = CR/LF.
When we read this file from the FTP server, the first column of the first row has a leading question mark. Can anyone help identify what is causing this problem?

Sample Data:
?2015061 abcdefg
20150616 efghijkl
20150617 mnopqr

Other info: the NLS setting of the stage is also UTF-8.
Final Delimiter = end
Record Delimiter String = Dos Format
Delimiter = |
Null Field Value = ''
Quote = none


Thank you very much.

Posted: Mon Oct 19, 2015 10:03 pm
by ray.wurlod
My guess is that it's a Byte Order Mark.
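
For reference, a UTF-8 Byte Order Mark is the three bytes EF BB BF at the very start of a file, and a reader that does not expect it will often display it as a stray character. Below is a minimal Python sketch of a check, assuming you can get at the raw bytes of the file outside DataStage. The helper name `has_utf8_bom` and the sample row are made up for illustration.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# In-memory example; for a real file, pass open(path, "rb").read(3) instead.
with_bom = codecs.BOM_UTF8 + b"20150615|abcdefg\r\n"
without_bom = b"20150615|abcdefg\r\n"

print(has_utf8_bom(with_bom))     # True
print(has_utf8_bom(without_bom))  # False
```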

Posted: Tue Oct 20, 2015 12:17 am
by Kel
Hi Ray, do you have any idea how I can remove this?
We have tried setting Strip BOM = True on the Sequential File stage, but it does not remove the leading question mark.

Thank you very much.
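
If the stage option has no effect, one possible alternative is to strip the BOM in a pre-processing step before the job reads the file. This is only a sketch, and it assumes you are allowed to run a small script against the file outside DataStage. The helper name `strip_utf8_bom` and the sample row are hypothetical.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# In-memory example. For a real file, read it with open(path, "rb"),
# pass the bytes through strip_utf8_bom, and write the result back in "wb" mode.
raw = codecs.BOM_UTF8 + b"20150615|abcdefg\r\n"
print(strip_utf8_bom(raw))  # b'20150615|abcdefg\r\n'
```

Working on bytes rather than decoded text leaves the CR/LF line endings and the rest of the content untouched.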

Posted: Tue Oct 20, 2015 12:55 am
by priyadarshikunal
Then first check whether the question mark is also there when you create a local file, or whether there is an unprintable character at the beginning.

Posted: Tue Oct 20, 2015 2:26 am
by Kel
Hi,

We copied the file to the FTP server without any suspicious characters (20150615); the format is UTF-8. When the FTP stage reads it, the question mark appears. So we copied the file again from the FTP server to a local path, and the result looks like this: 20150615 (there is a dot in the middle of the 2, which I cannot copy and paste here).

Posted: Tue Oct 20, 2015 7:50 am
by chulett
Kel wrote: We copied the file to the FTP Server without any suspicious characters
I'm guessing it's actually there, you're just not able to see it. Have you dumped the hex values in the file? On a UNIX system you could do an "od" or Octal Dump on it; for Windows there are any number of free Hex Editors you can install for this task. They would show any "unprintable" characters in the file in a manner that would allow those in the know to identify them.
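
If neither tool is handy, the same kind of dump can be sketched in a few lines of Python. This is only an illustration, assuming Python is available on the machine that holds the file. The helper name `hex_head` and the sample bytes are made up.

```python
def hex_head(data: bytes, count: int = 16) -> str:
    """Return the first `count` bytes as space-separated two-digit hex values."""
    return " ".join(f"{byte:02x}" for byte in data[:count])


# In-memory example; for a real file, pass open(path, "rb").read(16) instead.
sample = b"\xef\xbb\xbf2015"
print(hex_head(sample))  # ef bb bf 32 30 31 35
```

A leading `ef bb bf` in the output would point to a UTF-8 Byte Order Mark; anything else before the first digit is some other unprintable character.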

Posted: Tue Oct 20, 2015 6:43 pm
by Kel
Hi Guys,

We found a workaround for the problem, as follows.

1.) Added the column names on the first line and increased the field lengths to accommodate them. The first line, with the column names, now contains the question mark character.

2.) Set First Line Column Names = False on the target Sequential File. The target Sequential File now has the column names.

3.) In the next job, which uses the Sequential File as input, set First Line Column Names = True. The question mark, which was on the column names line, is now removed from the stream of data to be processed.

Hope this helps.