WE8ISO8859P1 vs AL32UTF8 vs UTF-8
Posted: Mon Sep 10, 2012 8:50 pm
All,
We are using 18 databases for our project. Datastage job will load the data into ORACLE. The NLS setting in datastage side is UTF-8.
Background:
In the datastage job, we are using ODBC stage to load the data into ORACLE. we used NVARCHAR in 8 version. We recently migrated datastage from 8 to 8.5. The major change which we did for this migration was " changed NVARCHAR to VARCHAR in all th ODBC stage". This exerciese is done for more than 2000 jobs.
Now, we faced some issue in PROD related to this datatype in one database. so we changed VARCHAR to NVARCHAR only for that particulat database and particular job.
I am new to this project. I am in the position to find the root cause. what is happening actually in between ORACLE and DATASTAGE.
The first step of analysis is to understand the CHARSET & NCHARSET defied in ORACLE side. I did that and noticed the below difference in that particular database.
SELECT * FROM sys.props$ WHERE name = 'NLS_CHARACTERSET' ;
NLS_CHARACTERSET AL32UTF8
SELECT *
FROM NLS_SESSION_PARAMETERS;
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_LANGUAGE ENGLISH
NLS_TERRITORY UNITED KINGDOM
NLS_CURRENCY #
NLS_ISO_CURRENCY UNITED KINGDOM
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET AL32UTF8
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE ENGLISH
NLS_SORT BINARY
NLS_TIME_FORMAT HH24.MI.SSXFF
NLS_TIMESTAMP_FORMAT DD-MON-RR HH24.MI.SSXFF
NLS_TIME_TZ_FORMAT HH24.MI.SSXFF TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH24.MI.SSXFF TZR
NLS_DUAL_CURRENCY ?
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_RDBMS_VERSION 10.2.0.4.0
Other Databases:
SELECT * FROM sys.props$ WHERE name = 'NLS_CHARACTERSET' ;
NLS_CHARACTERSET WE8ISO8859P1
SELECT *
FROM NLS_SESSION_PARAMETERS;
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET WE8ISO8859P1
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 10.2.0.4.0
I am sure this is causing because of some wrong CHARACTERSET defined either in DB or ETL side. I need some experts help to go further and start my second step analysis.
We are using 18 databases for our project. Datastage job will load the data into ORACLE. The NLS setting in datastage side is UTF-8.
Background:
In the datastage job, we are using ODBC stage to load the data into ORACLE. we used NVARCHAR in 8 version. We recently migrated datastage from 8 to 8.5. The major change which we did for this migration was " changed NVARCHAR to VARCHAR in all th ODBC stage". This exerciese is done for more than 2000 jobs.
Now, we faced some issue in PROD related to this datatype in one database. so we changed VARCHAR to NVARCHAR only for that particulat database and particular job.
I am new to this project. I am in the position to find the root cause. what is happening actually in between ORACLE and DATASTAGE.
The first step of analysis is to understand the CHARSET & NCHARSET defied in ORACLE side. I did that and noticed the below difference in that particular database.
SELECT * FROM sys.props$ WHERE name = 'NLS_CHARACTERSET' ;
NLS_CHARACTERSET AL32UTF8
SELECT *
FROM NLS_SESSION_PARAMETERS;
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_LANGUAGE ENGLISH
NLS_TERRITORY UNITED KINGDOM
NLS_CURRENCY #
NLS_ISO_CURRENCY UNITED KINGDOM
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET AL32UTF8
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE ENGLISH
NLS_SORT BINARY
NLS_TIME_FORMAT HH24.MI.SSXFF
NLS_TIMESTAMP_FORMAT DD-MON-RR HH24.MI.SSXFF
NLS_TIME_TZ_FORMAT HH24.MI.SSXFF TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH24.MI.SSXFF TZR
NLS_DUAL_CURRENCY ?
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_RDBMS_VERSION 10.2.0.4.0
Other Databases:
SELECT * FROM sys.props$ WHERE name = 'NLS_CHARACTERSET' ;
NLS_CHARACTERSET WE8ISO8859P1
SELECT *
FROM NLS_SESSION_PARAMETERS;
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET WE8ISO8859P1
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 10.2.0.4.0
I am sure this is causing because of some wrong CHARACTERSET defined either in DB or ETL side. I need some experts help to go further and start my second step analysis.