Posted: Thu Jul 23, 2015 4:20 pm
by ray.wurlod
Welcome aboard.

Does MS-1252 not recognise these Windows special characters?

Posted: Fri Jul 24, 2015 9:12 am
by chulett
Can you not ask whoever created the file for you what encoding it is using?

Posted: Fri Jul 24, 2015 11:05 am
by QualityStageatGM
It came in with Oracle's version of UTF-8.

I'll try creating a rule and see if the rule will correctly identify each special character as different from the others.

Thanks.

Re: NLS Map for Special Characters

Posted: Sun Jul 26, 2015 6:47 pm
by weiyi_will
Did you try saving the text files as UTF-8 manually, setting the NLS map to UTF-8 in the job properties, and setting the data type to NVarChar in the column definition?

Posted: Mon Aug 10, 2015 2:20 pm
by QualityStageatGM
Sorry for the late reply; I was out of the office for a couple of days. I've checked the job parameters and the stage properties, and both are set to UTF-8. It seems the box-like characters are not being transferred correctly.

Posted: Tue Aug 18, 2015 12:42 pm
by QualityStageatGM
Would you guys know what the most encompassing NLS map name is?

Thanks.

Posted: Wed Aug 19, 2015 1:01 am
by priyadarshikunal
Can you get the hex dump of those characters?

Posted: Wed Aug 19, 2015 8:13 am
by QualityStageatGM
I ran xxd on a file I created containing characters such as ▒, ╦, and ┼.
The output is below.
Please let me know if this is what you're looking for, or if you need a different file.

Code:

0000000: efbb bf22 496e 666f 7370 6865 7265 496e  ..."InfosphereIn
0000010: 666f 726d 6174 696f 6e22 0d0a 2249 6e66  formation".."Inf
0000020: 6f73 7068 6572 e296 9249 6e66 6f72 6d61  ospher...Informa
0000030: 7469 6f6e 220d 0a22 496e 666f 7370 6865  tion".."Infosphe
0000040: 72c3 ac49 6e66 6f72 6d61 7469 6f6e 220d  r..Information".
0000050: 0a22 496e 666f 7370 6865 72c5 9549 6e66  ."Infospher..Inf
0000060: 6f72 6d61 7469 6f6e 220d 0a22 496e 666f  ormation".."Info
0000070: 7370 6865 72c7 9d49 6e66 6f72 6d61 7469  spher..Informati
0000080: 6f6e 220d 0a22 496e 666f 7370 6865 72cf  on".."Infospher.
0000090: 81c7 9c49 6e66 6f72 6d61 7469 6f6e 220d  ...Information".
00000a0: 0a22 496e 666f 7370 6865 72e2 94bc 496e  ."Infospher...In
00000b0: 666f 726d 6174 696f 6e22 0d0a 2249 6e66  formation".."Inf
00000c0: 6f73 7068 6572 e295 a649 6e66 6f72 6d61  ospher...Informa
00000d0: 7469 6f6e 220d 0a22 496e 666f 7370 6865  tion".."Infosphe
00000e0: 7265 2049 6e66 6f72 6d61 7469 6f6e 220d  re Information".
00000f0: 0a22 496e 666f 7370 6865 7249 6e66 6f72  ."InfospherInfor
0000100: 6d61 7469 6f6e e296 9222 0d0a 2249 6e66  mation...".."Inf
0000110: 6f73 7068 6572 496e 666f 726d 6174 696f  ospherInformatio
0000120: 6ec3 ac22 0d0a 2249 6e66 6f73 7068 6572  n..".."Infospher
0000130: 496e 666f 726d 6174 696f 6ec5 9522 0d0a  Information.."..
0000140: 2249 6e66 6f73 7068 6572 496e 666f 726d  "InfospherInform
0000150: 6174 696f 6ec7 9d22 0d0a 2249 6e66 6f73  ation..".."Infos
0000160: 7068 6572 496e 666f 726d 6174 696f 6ecf  pherInformation.
0000170: 81c7 9c22 0d0a 2249 6e66 6f73 7068 6572  ...".."Infospher
0000180: 496e 666f 726d 6174 696f 6ee2 94bc 220d  Information...".
0000190: 0a22 496e 666f 7370 6865 7249 6e66 6f72  ."InfospherInfor
00001a0: 6d61 7469 6f6e e295 a622 0d0a 2249 6e66  mation...".."Inf
00001b0: 6f73 7068 6572 6520 496e 666f 726d 6174  osphere Informat
00001c0: 696f 6ee2 95a6 220d 0a                   ion..."..

Posted: Wed Aug 19, 2015 4:55 pm
by ray.wurlod
We can't see your "special" characters in this dump. Can you highlight the hex codes that correspond to them?
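
One way to pick the special characters out of a dump like this is to list every run of bytes at or above 0x80, together with its offset. The following is only a minimal sketch: the function name non_ascii_runs is made up for illustration, and the sample is just the first 0x37 bytes of the xxd output above, typed in by hand.

```python
import re

def non_ascii_runs(data: bytes):
    """Return (offset, hex string) for each run of bytes >= 0x80.

    Hypothetical helper for illustration; it is not part of any
    DataStage or QualityStage tooling.
    """
    return [(m.start(), m.group().hex())
            for m in re.finditer(rb"[\x80-\xff]+", data)]

# First 0x37 bytes of the dump posted above, copied by hand.
sample = bytes.fromhex(
    "efbbbf22496e666f737068657265496e"
    "666f726d6174696f6e220d0a22496e66"
    "6f7370686572e29692496e666f726d61"
    "74696f6e220d0a"
)

for offset, hex_run in non_ascii_runs(sample):
    print(f"offset 0x{offset:04x}: {hex_run}")
```

To run it against a real file, read the file in binary mode (open(path, "rb").read()) and pass the result to the function.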

Posted: Thu Aug 20, 2015 4:59 pm
by ray.wurlod
So, just looking at the first one, the hex bytes between "Infospher" and "Information" are e2 96 92 (offsets 0x26 through 0x28 in your dump); the 49 that follows is the "I" of "Information". You will need to find an appropriate Unicode map that tells you what this three-byte sequence is supposed to be. You will also need to find out how these "rogue" byte sequences got into the data in the first place, and take steps to remediate that.
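
As a quick check on what the non-ASCII byte sequences in the dump stand for, they can be decoded as UTF-8 and looked up in the Unicode character database. This is a rough sketch that assumes the file really is UTF-8, as stated earlier in the thread. The describe helper is a hypothetical name, and the hex strings were copied by hand from the xxd output.

```python
import unicodedata

# Non-ASCII byte sequences copied by hand from the xxd output above.
SEQUENCES = ["efbbbf", "e29692", "c3ac", "c595", "c79d",
             "cf81", "c79c", "e294bc", "e295a6"]

def describe(hex_bytes: str):
    """Decode a hex string as UTF-8 and return (code point, name) pairs.

    Hypothetical helper; it raises UnicodeDecodeError if the bytes
    are not valid UTF-8, which is itself a useful diagnostic.
    """
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in text]

for seq in SEQUENCES:
    print(seq, describe(seq))
```

If every sequence decodes without an error, the bytes are at least well-formed UTF-8, and the printed names can be compared with the characters that were expected in the file.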