Page 1 of 1

Missing character sets on DS server

Posted: Tue Aug 03, 2010 12:03 pm
by ASU_ETL_DEV
Hello,
We are in a PeopleSoft EPM 9.0 environment (Solaris, Oracle 10g, DataStage Server 7.5.2).
Source and target databases have the same character set settings:

Code: Select all

select value from nls_database_parameters where parameter = 'NLS_NCHAR_CHARACTERSET';--AL16UTF16
select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';--WE8ISO8859P15
DataStage was not passing accented characters to the target.
I checked the character sets installed on our DataStage development server to make sure that it had the database's one:

Code: Select all

#List UNIX server locale
> locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

#List all UNIX server locales installed
> locale -a
C
POSIX
en_CA
en_CA.ISO8859-1
en_CA.UTF-8
en_US
en_US.ISO8859-1
en_US.ISO8859-15
en_US.ISO8859-15@euro
es
es_MX
es_MX.ISO8859-1
fr
fr_CA
fr_CA.ISO8859-1
iso_8859_1 
I applied the following changes to the DS development environment's dsenv and bounced the service:

Code: Select all

	NLS_LANG="AMERICAN_AMERICA.WE8ISO8859P15";export NLS_LANG
	LANG="en_US.ISO8859-15";export LANG
	LC_ALL="en_US.ISO8859-15";export LC_ALL
With these settings the accented characters showed up in the target database.
I did not have to re-install DataStage with NLS turned on and I did not have to specify any NLS parameters in projects or jobs.

I performed the same change in the production environment and the accented characters were passed to the target.
At the same time one of the jobs in our nightly run that is backing up a hashed file to a sequential one got the following abort:

Code: Select all

'couldn't set locale correctly'
I checked the locale settings for the production DS server (I had assumed that they were the same as in the development box):

Code: Select all

> locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

> locale -a 
C 
POSIX
iso_8859_1
Question: Can I safely assume that after installing the missing character sets on the production server, this problem will be resolved?
Thanks

Posted: Wed Aug 04, 2010 2:56 am
by ArndW
Odd, as far as I know, the 3 environment variables NLS_LANG, LANG, and LC_ALL are ORACLE values and are not used by the DataStage server engine.

Could it be that the production server was not installed with NLS enabled? This would explain the missing character set definitions.

Posted: Wed Aug 04, 2010 8:34 am
by ASU_ETL_DEV
Yes, the production DS server (as well as the development one) was installed without NLS.

Posted: Wed Aug 04, 2010 4:33 pm
by ray.wurlod
Then you should have no character sets at all.

Posted: Wed Aug 04, 2010 5:03 pm
by ASU_ETL_DEV
Ray, your comment suggests that a DataStage NLS installation installs a number of character sets on the server.

In development and in production the NLS button in Administrator is disabled so we do not have an NLS installation.

I do not know how all those character sets got installed on the development server. The result is that in that environment, with those changes to dsenv I can move accented characters from source to target.
I can also run the backup job without that error.

Posted: Wed Aug 04, 2010 5:44 pm
by ray.wurlod
The ones you mentioned earlier are for Oracle, not for DataStage.

If you are using UTF-8 or any other non-ASCII character set, might I suggest that you install DataStage with NLS?

Posted: Wed Aug 04, 2010 8:48 pm
by ASU_ETL_DEV
You certainly may but before I go down the reinstallation route I would like to understand why in development I get the results I need without the NLS install. I read several posts that state that it is not necessary to do an NLS install when all that is needed is a straight move of the data from source to target.

Posted: Wed Aug 04, 2010 9:37 pm
by ray.wurlod
And that is true, in general. Non-NLS can transmit byte values faithfully.

So what's different between the two servers, or the two databases?

Use a Peek stage (or compile in trace mode) to determine whether the accented characters are getting in to DataStage in the first place.