Leading Question Mark Appears on First row of data

Kel
Participant
Posts: 31
Joined: Mon May 11, 2015 3:20 am
Location: Robinsons Cybergate Tower 2

Post by Kel »

Hi,

I have a file that is encoded in UTF-8 and uses CR/LF as its end-of-line marker.
When we read this file from the FTP server, the first column of the first row has a leading question mark. Can anyone help identify what is causing this problem?

Sample Data:
?2015061 abcdefg
20150616 efghijkl
20150617 mnopqr

Other info: the NLS of the stage is also set to UTF-8.
Final Delimiter = end
Record Delimiter String = Dos Format
Delimiter = |
Null Field Value = ''
Quote = none


Thank you very much.
BOG
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

My guess is that it's a Byte Order Mark.
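
For readers who want to see how a Byte Order Mark can end up displayed as a question mark, here is a minimal Python sketch. It is not part of the original reply, and the sample bytes are made up for illustration. It assumes the file starts with the UTF-8 BOM (bytes EF BB BF) and that some step converts the text to a character set that cannot represent U+FEFF.

```python
import codecs

# Hypothetical sample bytes: a UTF-8 BOM followed by two CR/LF-terminated rows.
sample = codecs.BOM_UTF8 + b"20150615|abcdefg\r\n20150616|efghijkl\r\n"

# A UTF-8 BOM is the three bytes EF BB BF at the very start of the data.
has_bom = sample.startswith(codecs.BOM_UTF8)
print(has_bom)

# Decoding as plain UTF-8 keeps the BOM as the invisible character U+FEFF.
text = sample.decode("utf-8")
first_char = text[0]
print(repr(first_char))

# Converting to a character set without U+FEFF, with replacement enabled,
# turns that character into a literal question mark.
shown = text.encode("ascii", errors="replace")[:9]
print(shown)
```

If the real file behaves like this sample, the question mark is a display substitute for the BOM rather than a character that was typed into the data.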
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Kel
Participant
Posts: 31
Joined: Mon May 11, 2015 3:20 am
Location: Robinsons Cybergate Tower 2

Post by Kel »

Hi Ray, do you have any idea how I can remove this?
We have tried setting Strip BOM = True on the sequential file, but it does not remove the leading question mark.

Thank you very much.
BOG
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Then first check whether the question mark is also there when you create a local file, or whether there is an unprintable character at the beginning.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
Kel
Participant
Posts: 31
Joined: Mon May 11, 2015 3:20 am
Location: Robinsons Cybergate Tower 2

Post by Kel »

Hi,

We copied the file to the FTP Server without any suspicious characters (20150615); the format is UTF-8. When the FTP Stage reads it, the question mark appears. So we tried copying the file back from the FTP Server to a local path, and the result looks like this: 20150615 (there is a dot in the middle of the 2; I cannot copy-paste the data here).
BOG
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Kel wrote: We copied the file to the FTP Server without any suspicious characters
I'm guessing it's actually there and you're just not able to see it. Have you dumped the hex values in the file? On a UNIX system you could run "od" (octal dump) on it; for Windows there are any number of free hex editors you can install for this task. They would show any "unprintable" characters in the file in a manner that would allow those in the know to identify them.
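
Where neither "od" nor a hex editor is at hand, the same check can be sketched in a few lines of Python. This is an illustration, not something from the thread: the helper name head_hex and the temporary demo file are hypothetical, and the demo content assumes a UTF-8 BOM at the start of the file.

```python
import os
import tempfile


def head_hex(path: str, count: int = 16) -> str:
    """Return the first `count` bytes of a file as space-separated hex."""
    with open(path, "rb") as fh:  # binary mode, so no decoding hides bytes
        return fh.read(count).hex(" ")


# Demo on a throwaway file with made-up content (BOM + one CR/LF row).
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"\xef\xbb\xbf20150615|abcdefg\r\n")
    demo_path = tmp.name

demo_hex = head_hex(demo_path, 8)
print(demo_hex)  # ef bb bf 32 30 31 35 30
os.remove(demo_path)
```

A dump that begins with "ef bb bf" indicates a UTF-8 BOM; "32 30 31 35" is the text "2015".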
-craig

"You can never have too many knives" -- Logan Nine Fingers
Kel
Participant
Posts: 31
Joined: Mon May 11, 2015 3:20 am
Location: Robinsons Cybergate Tower 2

Post by Kel »

Hi Guys,

We found a workaround for the problem, as follows.

1.) Added the column names as the first line and increased the field length to accommodate them. The first line, with the column names, now contains the question mark character.

2.) Set First Line Column Names = False on the Target Sequential File. The Target Sequential File now has the column names.

3.) In the next job, which uses the Sequential File as input, set First Line Column Names = True. The question mark, which was attached to the column names, is now removed from the stream of data to be processed.
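
The reason this workaround helps can be sketched in Python, assuming the stray character is a UTF-8 BOM as suggested earlier in the thread. The column names DATE and NAME, the sample rows, and the helper drop_header_line are all hypothetical; the sketch only mimics the effect of skipping the first line and is not how DataStage implements it.

```python
import codecs


def drop_header_line(data: bytes) -> bytes:
    """Discard everything up to and including the first CR/LF."""
    _header, _sep, rest = data.partition(b"\r\n")
    return rest


# Made-up file content: the BOM is glued to the header line, not to the data.
raw = codecs.BOM_UTF8 + b"DATE|NAME\r\n20150615|abcdefg\r\n20150616|efghijkl\r\n"

rows = drop_header_line(raw)
still_has_bom = rows.startswith(codecs.BOM_UTF8)
first_row = rows.split(b"\r\n")[0]

print(still_has_bom)
print(first_row)
```

Because the unwanted bytes sit at the very start of the file, they travel with whatever the first line is, so discarding a header line discards them too.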

Hope this helps.
BOG