Conversion of Chinese characters (SAP extract and load into Oracle)

Post questions here related to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

pbatchu
Charter Member
Posts: 20
Joined: Thu Aug 17, 2006 11:53 am
Location: Boise

Conversion of Chinese characters (SAP extract and load into Oracle)

Post by pbatchu »

Hi All,
We are in a particular situation. We extract SAP data using a CDC tool; the SAP database is Oracle. We want to load the extracted data into a datamart whose database is also Oracle, and we are using a DataStage PX job to load the data.
I have tried different combinations but could not succeed in loading foreign-language characters. What kind of conversion is needed for this?

Any pointer or shared experience in this regard is appreciated.

Thanks,
Pavan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Pavan,

Are you using NLS in your DataStage installation? It is not necessary to have NLS to do this, but whether or not it is installed changes how you would go about it. You will also need to identify where in the process your Chinese (Big5?) data is being incorrectly converted.
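
For example (a minimal Python sketch; the encoding names are only illustrative guesses), bytes written in one Chinese character set but decoded with another still render as Chinese-looking, yet wrong, characters:

Code:

# Minimal sketch: GB2312 bytes decoded with the wrong map still look Chinese.
# The choice of GB2312 vs. Big5 here is only an example of the symptom.
original = "\u4e2d\u6587"                 # the two characters "zhong wen" ("Chinese")
gb_bytes = original.encode("gb2312")      # encode with a GB2312-based map

print(gb_bytes.decode("gb2312"))          # correct round trip: original characters
print(gb_bytes.decode("big5", errors="replace"))   # wrong map: different, wrong hanzi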
pbatchu
Charter Member
Posts: 20
Joined: Thu Aug 17, 2006 11:53 am
Location: Boise

Post by pbatchu »

Hi,
I tried to use an NLS map, but it did not give correct results. I applied the NLS map on the Oracle Enterprise stage.
When I look at the data in Excel, I see Chinese characters, but they are not the right ones. I verified this against the SAP instance by logging in with that language.
I applied my database's Oracle NLS setting in the Oracle Enterprise stage.

The NLS_LANG setting is AMERICAN_AMERICA.ZHS16CGB231280.

How can I identify where the incorrect conversion happens in the process? It looks like you have experience in this regard; your help is appreciated.
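
For reference, ZHS16CGB231280 is Oracle's GB2312-based simplified Chinese character set, so one way to view the landed data without Excel's own encoding guesses is a short Python check (the file name is just a placeholder):

Code:

# Sketch: print the landed data decoded as GB2312 directly, bypassing
# Excel's encoding auto-detection, which can mislead.
with open("extract_sample.txt", encoding="gb2312", errors="replace") as f:
    for i, line in enumerate(f):
        if i >= 10:                       # first ten rows are enough to eyeball
            break
        print(line.rstrip())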

Thanks,
Pavan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Pavan,

Take things step by step. Read the SAP data and write it straight to a text file, keeping the same Chinese character set (are you 100% sure that the SAP source is in ZHS16CGB231280?). Is the result correct?
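
If you are not sure, a rough Python sketch like this one (the file name and the candidate list are assumptions) shows the raw bytes and tells you which candidate encodings decode them cleanly:

Code:

# Rough sketch: hex-dump the start of the landed file and test which
# candidate Chinese encodings decode the raw bytes without error.
CANDIDATES = ["gb2312", "gbk", "big5", "utf-8"]

with open("sap_extract.txt", "rb") as f:  # raw bytes, no decoding yet
    raw = f.read()

print(" ".join(f"{b:02x}" for b in raw[:40]))   # first bytes in hex for the code charts

for enc in CANDIDATES:
    try:
        sample = raw.decode(enc)
        print(f"{enc}: decodes cleanly, sample {sample[:20]!r}")
    except UnicodeDecodeError as e:
        print(f"{enc}: fails at byte offset {e.start}")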
trokosz
Premium Member
Posts: 188
Joined: Thu Sep 16, 2004 6:38 pm
Contact:

Post by trokosz »

What is the character set that was established in the Oracle database? It must be in sync with the approach in the DS job. For example, in DS you may define columns to be Unicode VarChar or Unicode Char (NVarChar and NChar), and then Oracle would have NLS_NCHAR_CHARACTERSET: AL32UTF8.

So syncing your NLS maps across databases and using the proper DS metadata designations is the key.
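
One quick way to confirm what the target database actually uses is to query NLS_DATABASE_PARAMETERS, for example with a small Python sketch (the cx_Oracle driver and the credentials here are placeholder assumptions):

Code:

# Sketch: confirm the target Oracle character sets before choosing NLS maps
# and Unicode column types in the DS job. Credentials are placeholders.
import cx_Oracle

conn = cx_Oracle.connect("user", "password", "dbhost/orcl")
cur = conn.cursor()
cur.execute(
    "SELECT parameter, value FROM nls_database_parameters "
    "WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET')"
)
for parameter, value in cur:
    print(parameter, "=", value)          # e.g. NLS_NCHAR_CHARACTERSET = AL32UTF8
conn.close()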
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's a little more complex than explained. A continuous process captures changed rows in SAP into a text file. A new file is opened every hour. The SAP records in this file may contain Chinese characters as well as English.

DataStage initiates (via BAPI) an ABAP process that consolidates the hourly files into a single file. This file still contains Chinese characters.

It is possible, however, that changes in Japanese and other data may also be captured by this process, and consolidated into the single file. This fact was not reported in the original post.

(Please note, I am at the site at present - this is not a case of RMM stage!)

A mixture of rows encoded differently in the same file will make it very difficult, if not impossible, for DataStage to read this sequential file. The SAP-based processes need to be reviewed so that any one file to be processed by DataStage contains only one character encoding.
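
If the SAP side cannot be changed immediately, one stopgap is to pre-split the file by encoding before DataStage reads it. This is a rough and admittedly imperfect Python sketch (file names and the candidate order are assumptions, and many byte sequences are valid in more than one encoding):

Code:

# Rough, imperfect sketch: partition rows of a mixed file by the first
# candidate encoding that decodes them without error. A stopgap only;
# fixing the SAP-side consolidation is the real cure.
CANDIDATES = ["gb2312", "shift_jis", "utf-8"]

outs = {enc: open(f"rows_{enc}.txt", "wb") for enc in CANDIDATES}
outs["unknown"] = open("rows_unknown.txt", "wb")

with open("consolidated_extract.txt", "rb") as f:
    for line in f:
        for enc in CANDIDATES:
            try:
                line.decode(enc)          # succeeds -> claim the row for enc
                outs[enc].write(line)
                break
            except UnicodeDecodeError:
                continue
        else:
            outs["unknown"].write(line)   # no candidate decoded this row

for fh in outs.values():
    fh.close()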
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.