This looks like a typical NLS issue.
First of all, you need to find out in what character set the data was stored in DB2. If the characters were converted to EBCDIC500, which doesn't support Thai characters, then your NLS information is --poof-- gone forever.
If the data was entered into the database in the character set it was typed in, i.e. with no conversion, then in DataStage you need to specify that character set as the NLS source type. Likewise, if the data was converted from native Thai to a character set that maps the Thai letters, then you need to specify that as the DB2 source character set.
Once DataStage knows what character set to use while reading, it can then perform mapping (where possible) to another character set.
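A minimal sketch (plain Python, not DataStage itself) of that two-step flow: decode with the declared source character set, then re-encode into the target. The sample bytes below are an illustrative assumption that the DB2 column holds TIS-620 (single-byte Thai) data.

```python
# Step 1: interpret the raw bytes using the declared SOURCE charset.
# Step 2: map the resulting characters to the TARGET charset.
# The byte values are assumed TIS-620 data, purely for illustration.

raw = bytes([0xAA, 0xD1, 0xC2])      # TIS-620 bytes for the Thai word "ชัย"

text = raw.decode("tis_620")         # read with the correct source map
utf8 = text.encode("utf-8")          # map (where possible) to the target

print(text)                          # → ชัย
```

If you decode those same bytes with the wrong source charset (say, Latin-1), the mapping step still "works" but produces mojibake, which is exactly why the source character set has to be identified first.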
Character Conversion
The EBCDIC 500 character set supports Latin-1 encoding. It is a single-byte code page in which every position is occupied by a character; trying to encode any additional ones would mean that a single byte could stand for multiple characters. So if you read a 0x44 byte, it could mean either the EBCDIC 500 entry or perhaps a mapped Thai character, and there is absolutely no way for DataStage to determine which character it represents.
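The ambiguity is easy to demonstrate in plain Python (a sketch, not DataStage): the same byte decodes to different characters depending on which single-byte table you assume, and the byte alone carries no hint of which table is right.

```python
# One byte, two legitimate single-byte interpretations.
b = b"\x44"

as_ebcdic = b.decode("cp500")     # EBCDIC 500 (CCSID 500) reading
as_thai   = b.decode("tis_620")   # TIS-620 (single-byte Thai) reading

# TIS-620 is ASCII-compatible below 0x80, so 0x44 is 'D' there,
# while EBCDIC 500 puts a different character at that position.
print(as_ebcdic == as_thai)       # → False
```

Nothing in the byte stream tells a reader which decoding was intended; that information has to come from metadata, which is why the source charset must be declared correctly up front.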
Perhaps the original single-byte Thai characters were just written, 1-to-1, into the table, in which case you could use a Thai character set to read them. If that wasn't done, your data is corrupted and unusable.