File created with encoding other than specified in NLS.

theverma
Participant
Posts: 91
Joined: Tue Jul 18, 2006 10:41 am
Location: India

Post by theverma »

Hello Friends,

I have an ANSI file. In DataStage, I am reading that file as UTF-8 (by setting the NLS option to UTF-8 in the Sequential File stage properties).
At the output, I am creating a UTF-8 encoded file (again by setting the NLS option to UTF-8 in the Sequential File stage properties).

But the output file's encoding is ANSI (Windows 1252: Western European). Although the file does not contain any multi-byte (non-ASCII) characters, I would expect that when I specify UTF-8 as the output file encoding in the NLS option of the Sequential File stage, the output file should have UTF-8 encoding.

Earlier, my project default NLS option was MS-1252. I have changed this to UTF-8 as well, by going into the Administrator and setting the NLS option for that project to UTF-8.

Do we have to restart the services, or is there some other step needed to change the project-level default NLS settings?
Also, why are the files being created as Windows 1252: Western European even when the NLS setting is UTF-8?

Please suggest,
Thanks
Arun Verma
gbusson
Participant
Posts: 98
Joined: Fri Oct 07, 2005 2:50 am
Location: France

Post by gbusson »

Hi,

You can override your NLS settings in an import or export stage.
Check the NLS box of your stage.


To change the charset, the strings must have Nchar, Nvarchar or char[unicode] types! Otherwise, the data will be treated as bytes and not translated!
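The difference between "treated as bytes" and "translated" can be sketched in plain Python. This is only an illustration of the general idea, not of DataStage internals; the function names and the sample bytes are made up for the example.

```python
def pass_through(raw: bytes) -> bytes:
    """Hypothetical 'treated as bytes' path: nothing is converted."""
    return raw


def transcode(raw: bytes, source: str = "cp1252", target: str = "utf-8") -> bytes:
    """Hypothetical 'translated' path: decode with the source map, encode with the target map."""
    return raw.decode(source).encode(target)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "café" as Windows-1252 bytes: the é is the single byte 0xE9.
raw = b"caf\xe9"

untouched = pass_through(raw)   # still contains the lone 0xE9 byte
converted = transcode(raw)      # é is now the two bytes 0xC3 0xA9

print(is_valid_utf8(untouched), is_valid_utf8(converted))
```

If the bytes are only passed through, the 0xE9 byte stays as it is and the result is not valid UTF-8; only the translated path produces the two-byte UTF-8 form.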

Post by theverma »

I performed one more test with a file containing Unicode characters. Now the output file is also UTF-8 (as specified in the stage NLS setting).

But I want to know why the file format was ANSI when the input file does not contain any Unicode characters.
Is this because I have the MS1252 NLS setting as the project default?

I have changed the project default as well, but even then the file encoding was ANSI (when the input file does not contain any Unicode characters).
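One thing that may explain both observations: text made up only of ASCII characters produces exactly the same bytes in UTF-8 and in Windows-1252, so a tool that inspects the file has nothing to tell the two apart. A minimal sketch in plain Python (not DataStage; the sample strings are made up):

```python
# ASCII-only text: every character is one byte with the same value in both encodings.
ascii_only = "Hello Friends, 12345"
utf8_ascii = ascii_only.encode("utf-8")
cp1252_ascii = ascii_only.encode("cp1252")
same_for_ascii = utf8_ascii == cp1252_ascii

# Text with a non-ASCII character: the byte sequences diverge.
with_accent = "café"
utf8_accent = with_accent.encode("utf-8")      # é becomes two bytes: C3 A9
cp1252_accent = with_accent.encode("cp1252")   # é is the single byte E9
same_for_accent = utf8_accent == cp1252_accent

print(same_for_ascii, same_for_accent)
print(utf8_accent.hex(" "), "|", cp1252_accent.hex(" "))
```

Assuming the output file carries no byte order mark, an ASCII-only UTF-8 file is byte-for-byte a valid Windows-1252 file, and an editor that guesses the encoding may well label it "ANSI". Once a multi-byte character is present, the guess can change to UTF-8.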

Thanks for your reply.

Thanks
Arun Verma

Post by gbusson »

I do not understand.

NLS is set up at runtime.

And what do you mean by "when the input file does not contain any Unicode characters"?

Post by theverma »

We can set the NLS option at stage level as well, in the Sequential File stage properties.
I have set it to UTF-8 in both my input Sequential File stage and my output Sequential File stage. "Input file does not contain Unicode characters" means that every character takes 1 byte; there is no character that takes 2 or more bytes.

Hope that makes it clear.
Thanks
Arun Verma
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

In practice "UTF-8" is not one uniform thing: several variants of UTF-8 are in use. DataStage itself uses an idiosyncratic encoding called UV-UTF8. UTF = "Unicode Transformation Format" and the 8 means that it is built from eight-bit code units (there are also 16-bit encodings).

You might like to try ISO 8859-1 (sometimes erroneously referred to as "eight bit ASCII") as your character map.
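If you do try ISO 8859-1, keep in mind that it is close to Windows-1252 but not identical: Windows-1252 assigns printable characters such as the euro sign to bytes in the 0x80 to 0x9F range, where ISO 8859-1 has none. A small check in plain Python (the helper name is made up for the example; this says nothing about how DataStage names or implements its maps):

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Accented Western European letters sit at the same byte in both maps.
e_acute_same = "é".encode("iso-8859-1") == "é".encode("cp1252")

# The euro sign exists in Windows-1252 (byte 0x80) but not in ISO 8859-1.
euro_in_cp1252 = can_encode("€", "cp1252")
euro_in_latin1 = can_encode("€", "iso-8859-1")

print(e_acute_same, euro_in_cp1252, euro_in_latin1)
```

For data that only uses ordinary accented letters the two maps behave the same; the difference only shows up for characters such as the euro sign or curly quotes.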
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.