I have an Information Analyzer issue. The data comes from an Oracle table and contains Korean characters stored as UTF-8.
When I import the data into Excel, the Korean text displays normally (the third column below is the NAME field):
1 100 대표 2008-12-18 08:56:54
1 101 안중태 2008-12-18 08:56:54
1 102 천정균 2008-12-18 08:56:54
1 103 이숙자 2008-12-18 08:56:54
1 104 김형래 2008-12-18 08:56:54
1 105 하현준 2008-12-18 08:56:54
1 106 김연규 2008-12-18 08:56:54
1 107 채희권 2008-12-18 08:56:54
1 108 김용애 2008-12-18 08:56:54
1 109 허성 2008-12-18 08:56:54
When I import the table and run column analysis on it (first 3 columns), the column analysis completes just fine, but when I view or drill into the data it does not show the Korean characters, and I also do not believe it reports the <b>cardinality accurately for the NAME column</b>.
Is there a specific process to be followed for profiling foreign-language data stored as UTF-8?
We do have NLS installed on the server.
In fact, this data file is created by a job using the UTF-8 character map.
Please advise.
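Before profiling, it can help to confirm that the flat file really is valid UTF-8 and actually contains Hangul. A minimal sketch in Python (the file name is a placeholder for whatever the job produces):

```python
# Minimal check: is the file valid UTF-8, and does it contain Hangul?
# The path passed in is a placeholder for the flat file created by the job.
def looks_like_utf8_korean(path):
    with open(path, "rb") as f:
        raw = f.read()
    try:
        text = raw.decode("utf-8")  # strict decode; fails on non-UTF-8 bytes
    except UnicodeDecodeError:
        return False
    # Precomposed Hangul syllables occupy U+AC00..U+D7A3
    return any("\uac00" <= ch <= "\ud7a3" for ch in text)
```

If this returns False on the file, the problem is upstream of Information Analyzer; if it returns True, the file is fine and the issue is in how the tool's NLS map decodes it.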
Viewing Korean Characters in Information Analyzer
There is not a single "UTF-8" encoding: Unicode Transformation Format (8-bit) is implemented differently in different places. See the Unicode Consortium website for more information.
How was the source file created? Are you sure it wasn't one of the Korean-specific character maps? Have you tried any of these with DataStage?
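The point about character maps can be illustrated with Python's built-in codecs (a sketch, not tied to any particular DataStage map name): the same Korean name has a different byte representation under each Korean-capable encoding, and decoding the bytes with the wrong map produces mojibake rather than Hangul.

```python
# The same Korean name under two common Korean-capable encodings.
name = "안중태"                       # sample value from the NAME column
utf8_bytes = name.encode("utf-8")    # 3 bytes per syllable -> 9 bytes
euckr_bytes = name.encode("euc-kr")  # 2 bytes per syllable -> 6 bytes
print(len(utf8_bytes), len(euckr_bytes))  # 9 6

# Decoding UTF-8 bytes through a single-byte map yields mojibake,
# which is typically what a viewer shows when its NLS map does not
# match the data.
print(utf8_bytes.decode("latin-1"))
```

So a client that displays garbage instead of Hangul is usually decoding correct bytes through the wrong map, not reading corrupt data.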
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
<b>How was the source file created?</b>
Thanks Ray.
The Korean customers gave us an Excel file with sample data.
1) The Excel sheet was saved as Unicode text (tab-separated).
2) The Unicode text file was converted to UTF-8 encoding using .NET classes.
3) The new UTF-8 file was read by a DataStage job using the UTF-8 NLS map, and only a few fields were passed on (the first 3).
4) The DataStage job loaded the Oracle table and created a flat file with the 3 fields.
5) When I import the data from the Oracle table, I see the Korean characters just fine.
6) But in IA it does not look Korean at all.
<b>Question</b>
When I view the data of the UTF-8 file created in step 2, I do not see Korean characters. I am assuming DataStage is still converting it okay, since it loads into Oracle just fine.
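The conversion in step 2 can be sketched outside .NET as well. Excel's "Unicode Text" save format is UTF-16 with a byte-order mark, so the conversion amounts to a decode/re-encode (file names here are hypothetical):

```python
# Sketch of step 2: convert Excel "Unicode Text" (UTF-16 with a BOM,
# tab-separated) to UTF-8. The paths are placeholders.
def utf16_to_utf8(src_path, dst_path):
    with open(src_path, "r", encoding="utf-16") as src:
        text = src.read()          # the BOM is consumed by the codec
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)            # written without a BOM
```

One detail worth checking: Python's `utf-8` codec writes no BOM, while some Windows/.NET writers prepend one (`EF BB BF`); a stray BOM at the start of the file is a common reason downstream tools misread the first field.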
Last edited by sigma on Fri Dec 19, 2008 10:09 am, edited 1 time in total.