Conversion of Chinese characters (SAP extract and load into Oracle)

Post questions here related to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

pbatchu
Charter Member
Posts: 20
Joined: Thu Aug 17, 2006 11:53 am
Location: Boise

Conversion of Chinese characters (SAP extract and load into Oracle)

Post by pbatchu »

Hi All,
We are in a particular situation. We extract SAP data using a CDC tool; the SAP database is Oracle. We want to load the extracted data into a datamart whose database is also Oracle, and we are using a DataStage PX job to load the data.
I have tried different combinations but could not succeed in loading foreign-language characters. What kind of conversion is needed for this?

Any pointer or shared experience in this regard is appreciated.

Thanks,
Pavan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Pavan,

Are you using NLS in your DataStage installation? It is not necessary to have NLS to do this, but whether or not it is installed changes how you would go about it. You will also need to identify where in the process your Chinese (Big5?) data is being incorrectly converted.
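
For example (a minimal Python sketch; the encoding names are only illustrative guesses), bytes written in one Chinese character set but decoded with another still render as Chinese-looking, yet wrong, characters:

Code:

# Minimal sketch: GB2312 bytes decoded with the wrong map still look Chinese.
# The choice of GB2312 vs. Big5 here is only an example of the symptom.
original = "\u4e2d\u6587"                 # the two characters "zhong wen" ("Chinese")
gb_bytes = original.encode("gb2312")      # encode with a GB2312-based map

print(gb_bytes.decode("gb2312"))          # correct round trip: original characters
print(gb_bytes.decode("big5", errors="replace"))   # wrong map: different, wrong hanzi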
pbatchu
Charter Member
Posts: 20
Joined: Thu Aug 17, 2006 11:53 am
Location: Boise

Post by pbatchu »

Hi,
I tried to use an NLS map, but it did not give correct results. I applied the NLS map on the Oracle Enterprise stage.
When I look at the data in Excel, I see Chinese characters, but they are not the right ones. I verified this against the SAP instance by logging in with that language.
I applied my database's Oracle NLS setting in the Oracle Enterprise stage.

The NLS_LANG setting is AMERICAN_AMERICA.ZHS16CGB231280.

How can I identify where the incorrect conversion happens in the process? It looks like you have experience in this regard; your help is appreciated.
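
For reference, ZHS16CGB231280 is Oracle's GB2312-based simplified Chinese character set, so one way to view the landed data without Excel's own encoding guesses is a short Python check (the file name is just a placeholder):

Code:

# Sketch: print the landed data decoded as GB2312 directly, bypassing
# Excel's encoding auto-detection, which can mislead.
with open("extract_sample.txt", encoding="gb2312", errors="replace") as f:
    for i, line in enumerate(f):
        if i >= 10:                       # first ten rows are enough to eyeball
            break
        print(line.rstrip())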

Thanks,
Pavan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Pavan,

Take things step by step. Read the SAP data and write it straight to a text file, keeping the same Chinese character set (are you 100% sure that the SAP source is in ZHS16CGB231280?). Is the result correct?
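
If you are not sure, a rough Python sketch like this one (the file name and the candidate list are assumptions) shows the raw bytes and tells you which candidate encodings decode them cleanly:

Code:

# Rough sketch: hex-dump the start of the landed file and test which
# candidate Chinese encodings decode the raw bytes without error.
CANDIDATES = ["gb2312", "gbk", "big5", "utf-8"]

with open("sap_extract.txt", "rb") as f:  # raw bytes, no decoding yet
    raw = f.read()

print(" ".join(f"{b:02x}" for b in raw[:40]))   # first bytes in hex for the code charts

for enc in CANDIDATES:
    try:
        sample = raw.decode(enc)
        print(f"{enc}: decodes cleanly, sample {sample[:20]!r}")
    except UnicodeDecodeError as e:
        print(f"{enc}: fails at byte offset {e.start}")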
trokosz
Premium Member
Posts: 188
Joined: Thu Sep 16, 2004 6:38 pm
Contact:

Post by trokosz »

What is the character set that was established in the Oracle database? It must be in sync with the approach in the DS job. For example, in DS you may define columns to be Unicode VarChar or Unicode Char (NVarChar and NChar), and then Oracle would have NLS_NCHAR_CHARACTERSET: AL32UTF8.

So syncing your NLS maps across databases and using the proper DS metadata designations is the key.
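
One quick way to confirm what the target database actually uses is to query NLS_DATABASE_PARAMETERS, for example with a small Python sketch (the cx_Oracle driver and the credentials here are placeholder assumptions):

Code:

# Sketch: confirm the target Oracle character sets before choosing NLS maps
# and Unicode column types in the DS job. Credentials are placeholders.
import cx_Oracle

conn = cx_Oracle.connect("user", "password", "dbhost/orcl")
cur = conn.cursor()
cur.execute(
    "SELECT parameter, value FROM nls_database_parameters "
    "WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET')"
)
for parameter, value in cur:
    print(parameter, "=", value)          # e.g. NLS_NCHAR_CHARACTERSET = AL32UTF8
conn.close()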
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's a little more complex than explained. A continuous process captures changed rows in SAP into a text file. A new file is opened every hour. The SAP records in this file may contain Chinese characters as well as English.

DataStage initiates (via BAPI) an ABAP process that consolidates the hourly files into a single file. This file still contains Chinese characters.

It is possible, however, that changes in Japanese and other data may also be captured by this process, and consolidated into the single file. This fact was not reported in the original post.

(Please note, I am at the site at present - this is not a case of RMM stage!)

A mixture of rows encoded differently in the same file will make it very difficult, if not impossible, for DataStage to read this sequential file. The SAP-based processes need to be reviewed so that any one file to be processed by DataStage contains only one character encoding.
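
If the SAP side cannot be changed immediately, one stopgap is to pre-split the file by encoding before DataStage reads it. This is a rough and admittedly imperfect Python sketch (file names and the candidate order are assumptions, and many byte sequences are valid in more than one encoding):

Code:

# Rough, imperfect sketch: partition rows of a mixed file by the first
# candidate encoding that decodes them without error. A stopgap only;
# fixing the SAP-side consolidation is the real cure.
CANDIDATES = ["gb2312", "shift_jis", "utf-8"]

outs = {enc: open(f"rows_{enc}.txt", "wb") for enc in CANDIDATES}
outs["unknown"] = open("rows_unknown.txt", "wb")

with open("consolidated_extract.txt", "rb") as f:
    for line in f:
        for enc in CANDIDATES:
            try:
                line.decode(enc)          # succeeds -> claim the row for enc
                outs[enc].write(line)
                break
            except UnicodeDecodeError:
                continue
        else:
            outs["unknown"].write(line)   # no candidate decoded this row

for fh in outs.values():
    fh.close()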
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.