Error reading Japanese data


BradMiller
Premium Member
Posts: 87
Joined: Mon Feb 18, 2008 3:58 pm
Location: Sacramento, CA

Error reading Japanese data

Post by BradMiller »

I have a requirement to read Japanese data from a .doc file and write it into a DB2 database table. DataStage and the DB2 database are both NLS installations. The NLS_LANG set on the operating system is EN_US, but the operating system also has Unicode locales installed. I am reading the data from a sequential file with NLS_Map set to EN_US.UTF-8, but it still cannot read the data. Do I need to export NLS_LANG as EN_US.UTF-8 in dsenv and leave the operating system locale as it is? Would this resolve the problem? If not, could you please suggest what steps need to be taken? I did this six years ago and don't remember the procedure. Also, do I need to make any changes on the database side?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Japanese data is fraught - there are at least 14 different encodings in which it might be supplied. Your first step will need to be to discover how the data are actually encoded. A common map is SHIFT_JIS but there are even variants of that. You need to ask the providers of the data to be as complete and explicit as possible.
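
While waiting on the data provider, a rough way to narrow down the candidates is to try decoding a sample of the raw bytes under a few common Japanese encodings. The sketch below is only an illustration: the candidate list and the function name `plausible_encodings` are assumptions of mine, not anything supplied by DataStage, and a clean decode does not prove the encoding is right, it only rules out the ones that fail.

```python
# Candidate encodings are an assumption; extend the list with
# whatever the data provider mentions.
CANDIDATES = ["utf-8", "shift_jis", "cp932", "euc_jp", "iso2022_jp"]


def plausible_encodings(raw: bytes, candidates=CANDIDATES):
    """Return the candidate encodings under which raw decodes without error."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            # This encoding cannot represent the byte sequence; rule it out.
            continue
        ok.append(name)
    return ok
```

For example, reading the first few kilobytes of the file in binary mode and passing them to `plausible_encodings` gives a shortlist to check by eye against text whose content is already known.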

You may encounter other "nice" things, particularly if the data are sourced from mainframes, like using different encodings in different columns or even changing encodings part way through a string (signalled by Shift-In and Shift-Out characters, Char(15) and Char(14) respectively).
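
To check whether a field switches encodings part way through, one option is to scan the raw bytes for the shift control characters and split the string at those points. This is a simplified sketch under the assumption that Shift-Out (byte 14) starts a double-byte run and Shift-In (byte 15) ends it; the function name `shift_segments` is hypothetical, and real mainframe data may need more careful handling.

```python
SHIFT_OUT = 0x0E  # Char(14): assumed to start a double-byte run
SHIFT_IN = 0x0F   # Char(15): assumed to return to single-byte mode


def shift_segments(raw: bytes):
    """Split raw into (is_shifted, bytes) runs, toggling on SO/SI bytes."""
    segments = []
    shifted = False
    current = bytearray()
    for b in raw:
        if b in (SHIFT_OUT, SHIFT_IN):
            # Close the run collected so far, then switch mode.
            if current:
                segments.append((shifted, bytes(current)))
                current = bytearray()
            shifted = (b == SHIFT_OUT)
        else:
            current.append(b)
    if current:
        segments.append((shifted, bytes(current)))
    return segments
```

If more than one segment comes back for a single column value, that column is mixing encodings and will likely need a map that understands the shift sequences rather than a plain single-encoding map.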

I personally believe that server jobs do a better job with Japanese data than parallel jobs, but I have not had the opportunity to test my theory in version 8.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.