Viewing Korean Characters in Information Analyzer

Posted: Thu Dec 18, 2008 12:57 pm
by sigma
I have an Information Analyzer issue. The data comes from an Oracle table and consists of Korean characters stored as UTF-8.

When I import the data into Excel, the Korean displays normally (the 3rd column below is the NAME field):

1 100 대표 2008-12-18 08:56:54
1 101 안중태 2008-12-18 08:56:54
1 102 천정균 2008-12-18 08:56:54
1 103 이숙자 2008-12-18 08:56:54
1 104 김형래 2008-12-18 08:56:54
1 105 하현준 2008-12-18 08:56:54
1 106 김연규 2008-12-18 08:56:54
1 107 채희권 2008-12-18 08:56:54
1 108 김용애 2008-12-18 08:56:54
1 109 허성 2008-12-18 08:56:54

When I import the above table and run column analysis on it (first 3 columns), the column analysis itself completes fine, but when I view or drill into the data it does not show Korean characters. I also do not believe it gives the <b>cardinality accurately for the Name column</b>.


Is there a specific process to follow for profiling foreign-language data stored as UTF-8?
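One way to sanity-check the cardinality outside Information Analyzer is to count the distinct NAME values directly. The following is a minimal sketch, assuming the ten sample rows above are whitespace-separated; the function name `name_cardinality` is hypothetical, and the NFC normalization step is an assumption added so that precomposed and decomposed Hangul forms of the same name count as one value. It is not how Information Analyzer computes cardinality.

```python
import unicodedata

# Sample rows copied from the post above (NAME is the 3rd field).
sample = """\
1 100 대표 2008-12-18 08:56:54
1 101 안중태 2008-12-18 08:56:54
1 102 천정균 2008-12-18 08:56:54
1 103 이숙자 2008-12-18 08:56:54
1 104 김형래 2008-12-18 08:56:54
1 105 하현준 2008-12-18 08:56:54
1 106 김연규 2008-12-18 08:56:54
1 107 채희권 2008-12-18 08:56:54
1 108 김용애 2008-12-18 08:56:54
1 109 허성 2008-12-18 08:56:54
"""

def name_cardinality(text):
    """Return (distinct_names, total_rows) for the 3rd whitespace-separated field."""
    names = [
        # Assumption: normalize to NFC so equivalent Hangul encodings compare equal.
        unicodedata.normalize("NFC", line.split()[2])
        for line in text.splitlines()
        if line.strip()
    ]
    return len(set(names)), len(names)

distinct, total = name_cardinality(sample)
print(distinct, total)
```

If the tool reports a noticeably lower distinct count than a check like this, that would suggest the names are being collapsed or corrupted during decoding rather than merely displayed incorrectly.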


We do have NLS installed on the server.

In fact, this data file is created by a job using the UTF-8 character map.

Please advise.

Posted: Thu Dec 18, 2008 3:20 pm
by ray.wurlod
There is not a single "UTF-8" encoding. Unicode Transformation Format (8-bit) is implemented differently in different places. See the Unicode Consortium website for more information.

How was the source file created? Are you sure it wasn't one of the Korean-specific character maps? Have you tried any of these with DataStage?

Posted: Fri Dec 19, 2008 9:11 am
by sigma
<B>How was the source file created?</B>

Thanks Ray.
The Korean customers gave us an Excel file with sample data.
1) The Excel sheet was saved as Unicode text (tab separated).
2) The Unicode text file was converted to UTF-8 encoding using .NET classes.
3) The new UTF-8 file was read by a DataStage job using the UTF-8 NLS map, and only a few fields were passed on (the first 3).
4) The DataStage job loaded the Oracle table and created a flat file with 3 fields.
5) When I import the data from the Oracle table I see the Korean characters just fine.
6) But in IA it does not look Korean at all.
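For reference, steps 1 and 2 above can be sketched as follows. This is a minimal illustration in Python rather than the .NET classes actually used, and it assumes that the "Unicode text" export is UTF-16 with a byte order mark, which is what Excel normally writes. The function name `utf16_to_utf8` is hypothetical.

```python
import codecs

def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 bytes (BOM selects the byte order) and re-encode as UTF-8 without a BOM."""
    text = data.decode("utf-16")   # the BOM is consumed during decoding
    return text.encode("utf-8")    # plain UTF-8, no BOM prepended

# Usage: simulate one tab-separated row from the Excel export.
row = "1\t100\t대표\r\n"
converted = utf16_to_utf8(row.encode("utf-16"))
print(converted.startswith(codecs.BOM_UTF8))
```

Whether the converted file starts with a UTF-8 byte order mark is worth knowing, because some readers treat the three BOM bytes as part of the first field.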

<B> Question </b>
When I use View Data on the UTF-8 file created in step 2, I do not see Korean characters. I am assuming DataStage is still converting correctly, since the data loads into Oracle just fine.
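A quick way to tell a display problem from an encoding problem is to inspect the raw bytes of the file. The sketch below is an assumption-laden illustration, not a DataStage or Information Analyzer feature: the function name `inspect_bytes` and the report keys are hypothetical, and the Hangul test only covers the precomposed syllable block U+AC00 to U+D7A3.

```python
import codecs

def inspect_bytes(data: bytes) -> dict:
    """Report the leading BOM (if any), UTF-8 validity, and presence of Hangul syllables."""
    report = {"bom": None, "valid_utf8": False, "has_hangul": False}
    if data.startswith(codecs.BOM_UTF8):
        report["bom"] = "utf-8"
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        report["bom"] = "utf-16"
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return report  # not valid UTF-8, so nothing more to check
    report["valid_utf8"] = True
    # Precomposed Hangul syllables occupy U+AC00..U+D7A3.
    report["has_hangul"] = any("\uac00" <= ch <= "\ud7a3" for ch in text)
    return report

# Usage: pass the first few kilobytes of the file read in binary mode, for example
#   with open(path, "rb") as f: print(inspect_bytes(f.read(4096)))
# where `path` is wherever the step 2 file lives.
print(inspect_bytes("1\t101\t안중태\n".encode("utf-8")))
```

If the file is valid UTF-8 and contains Hangul, the bytes are fine and the viewer is most likely decoding or rendering them with the wrong character map or font.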

Posted: Fri Dec 19, 2008 9:12 am
by sigma
Sorry about the bold characters. My apologies, I only intended the questions to be bolded but must have missed the ending tag.

Posted: Fri Dec 19, 2008 9:37 am
by chulett
So, go back and edit it.

Posted: Fri Dec 19, 2008 10:10 am
by sigma
Thanks, I have edited it so it is not bold anymore.

Any suggestions for my real problem?