Error Profiling Japanese Character

Posted: Thu Jan 29, 2009 2:28 am
by sweta rai
Hi,

I am trying to run a column analysis on a column that contains Japanese characters, but Information Analyzer cannot profile the data and gives a warning like this:

pxbridge: [IIS-CONN-DAAPI-000067] Schema reconciliation detected a size mismatch for field SUPNM. When moving data from source field type STRING(min=0,max=135,charset=windows-1252) into target field type STRING(min=0,max=135,charset=UTF-8 ), truncation, loss of precision or data corruption can occur. Use STRING(min=0,max=405,charset=UTF-8) for target type
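
A note on the numbers in that warning: 405 is exactly 3 times 135, which would be consistent with the bridge allowing up to three UTF-8 bytes per character (most Japanese characters take three bytes in UTF-8). That reading is an assumption; the message itself does not say so. A minimal Python sketch of the arithmetic, using a made-up sample value rather than real SUPNM data:

```python
# Sketch only: illustrates why a 135-character field might need 405 bytes
# in UTF-8. The sample string and helper names are hypothetical.

def utf8_bytes_per_char(text):
    """Return the UTF-8 byte length of each character in text."""
    return [len(ch.encode("utf-8")) for ch in text]


def worst_case_utf8_bytes(max_chars, bytes_per_char=3):
    """Worst-case byte length, assuming at most bytes_per_char per character."""
    return max_chars * bytes_per_char


if __name__ == "__main__":
    sample = "日本語"  # hypothetical value, three kanji
    print(utf8_bytes_per_char(sample))   # byte count per character
    print(worst_case_utf8_bytes(135))    # size suggested by the warning
```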



The data source is an Oracle database configured for UTF-8, so the Japanese characters are stored correctly there.

The driver which we are using is "IBM Oracle Wire Protocol".

What I understand from the warning above is that the driver we are using cannot read UTF-8 data and defaults to the windows-1252 charset.

Do I need to use a different driver to connect to the Oracle data source, or does something else need to be done?

Could somebody please advise? I'm stuck on this problem.

Posted: Thu Jan 29, 2009 3:15 am
by ray.wurlod
Code page 1252 (Windows Western European) cannot represent Japanese characters. You need to find out how the Japanese characters are encoded and specify the same mapping for DataStage to use. In this case, provided that the source really is encoded as UTF-8, you need to change the source metadata.
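
To make the point concrete, here is a small Python sketch (not DataStage code; the helper names and the sample string are made up for illustration). It shows that Japanese text cannot be encoded in windows-1252 at all, and that UTF-8 bytes read as if they were windows-1252 come out as unrelated Latin characters:

```python
# Sketch only: demonstrates the charset mismatch in plain Python.
# "cp1252" is Python's codec name for windows-1252.

def can_encode(text, charset):
    """Return True if every character in text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def misread_as(text, actual="utf-8", assumed="cp1252"):
    """Encode text with its real charset, then decode with the wrong one."""
    return text.encode(actual).decode(assumed, errors="replace")


if __name__ == "__main__":
    sample = "日本語"  # hypothetical value
    print(can_encode(sample, "cp1252"))  # Japanese is not in windows-1252
    print(can_encode(sample, "utf-8"))
    print(misread_as(sample))            # garbled: one Latin char per byte
```

The garbled result has one character per UTF-8 byte, which is the kind of corruption the size-mismatch warning is hinting at.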

Posted: Thu Jan 29, 2009 3:43 am
by sweta rai
Hi Ray,

Do I need to set the code page for Japanese characters when loading the data into the Oracle table?

OR

Do I need to set the NLS for Japanese characters in the IIS Administrator? Earlier it was set to UTF-8.

Please clarify.

Posted: Thu Jan 29, 2009 2:46 pm
by ray.wurlod
None of that is relevant until you can successfully READ the Japanese characters. You need to establish how these are encoded in your source table, and set the mapping in DataStage to correspond to that. Based on your opening question, the target already uses UTF-8, but I'd check that anyway.

Posted: Thu Jan 29, 2009 10:58 pm
by sweta rai
Ray, I'm afraid I could not follow what exactly you mean.

The DataStage job that populated the Oracle table from the source file has all of its encoding set correctly.
So the data in the Oracle table is stored correctly, and we can read the Japanese data in the table properly.

For profiling, we do not need to design any job. We just imported and bound that Oracle data source in Information Analyzer and ran a column analysis. That may be where the Japanese characters are not being read properly, giving the wrong analysis result.

What exactly needs to be done?

Posted: Thu Jan 29, 2009 11:04 pm
by ray.wurlod
sweta rai wrote: for field SUPNM. When moving data from source field type STRING(min=0,max=135,charset=windows-1252)
This tells me that either the source table has not been loaded properly or that your metadata are inconsistent.

Posted: Thu Jan 29, 2009 11:33 pm
by sweta rai
My apologies.

Let me try to make it clearer:

The source Oracle table has the Japanese data correctly populated.
But when we import its metadata into the IA repository as the source, it is not imported correctly and the charset is converted to windows-1252.

The IA server, on its side, has its encoding (UTF-8) set correctly at the target, which is why the message says "into target field type STRING(min=0,max=135,charset=UTF-8 )".

Correct me if I'm wrong, and please help me with this.

Posted: Fri Jan 30, 2009 12:55 am
by ray.wurlod
Report the metadata import bug to your support provider, then change the setting within IA to UTF-8 manually.

Posted: Fri Jan 30, 2009 2:19 am
by sweta rai
Hi Ray,
Thanks for your effort.

We found the solution.

We needed to add the following two parameters in the DataStage Administrator:

1. NLS_LANG, with its value set to AMERICAN_AMERICA.UTF8
2. DB2CODEPAGE, with its value set to 1208
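
For anyone who wants to confirm that both values are actually visible to a process, a minimal Python sketch follows. The variable names and values are the ones listed above (1208 is IBM's code page number for UTF-8); the helper name and the idea of checking them from Python are assumptions for illustration, not part of the product:

```python
# Sketch only: checks that the two settings from this thread are present
# in an environment mapping. check_encoding_env is a hypothetical helper.
import os

EXPECTED = {
    "NLS_LANG": "AMERICAN_AMERICA.UTF8",  # Oracle client character set
    "DB2CODEPAGE": "1208",                # DB2 client code page (UTF-8)
}


def check_encoding_env(env=None):
    """Return {name: actual_value} for every setting that is missing or different."""
    env = os.environ if env is None else env
    problems = {}
    for name, expected in EXPECTED.items():
        actual = env.get(name)
        if actual != expected:
            problems[name] = actual
    return problems


if __name__ == "__main__":
    mismatches = check_encoding_env()
    if mismatches:
        print("Settings that differ from the values in this thread:", mismatches)
    else:
        print("NLS_LANG and DB2CODEPAGE match the values in this thread.")
```

An empty result means both variables match; a value of None means the variable is not set at all in that environment.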

Posted: Fri Jan 30, 2009 2:50 am
by ray.wurlod
I had assumed that NLS_LANG was OK because you could access Japanese data in other tables satisfactorily.

You did not mention DB2 in your original question, so I assume that DB2CODEPAGE relates to your common metadata repository (XMETA) and its associated services or that your IA database (IADB) is DB2.

Posted: Fri Jan 30, 2009 4:57 am
by sweta rai
Yes, because the default analysis database (IADB) for IIS is DB2, and we need to specify its code page explicitly.