Character Conversion


ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

This looks like a typical NLS issue.

First of all, you need to find out which character set the data was stored in within DB2. If the characters were converted to EBCDIC500, which doesn't support Thai characters, then your Thai character information is --poof-- gone forever.
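To illustrate why such a conversion cannot be undone, here is a minimal Python sketch. It is not DataStage code; it assumes Python's `cp500` codec as a stand-in for EBCDIC500, and the sample string and the function name `to_ebcdic500` are made up for the example.

```python
# Hypothetical sample value; "cp500" is Python's name for EBCDIC 500.
thai_text = "สวัสดี"

def to_ebcdic500(text: str) -> bytes:
    """Convert to EBCDIC 500, substituting '?' for characters it cannot map."""
    return text.encode("cp500", errors="replace")

stored = to_ebcdic500(thai_text)    # what a lossy conversion would store
read_back = stored.decode("cp500")  # what any reader gets back afterwards

# Every Thai character collapsed to the same substitute byte, so the
# original text cannot be reconstructed from the stored bytes.
print(read_back)
```

A strict conversion (`errors="strict"`, the Python default) would raise `UnicodeEncodeError` instead of substituting, which is the other way a converter can react to unmappable characters.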

If the data was entered into the database in the character set it was typed in, i.e. with no conversion, then in DataStage you need to specify that character set as the NLS map for the source. Likewise, if the data was converted from native Thai to a character set that maps the Thai letters, then you need to specify that one as the DB2 source character set.

Once DataStage knows what character set to use while reading, it can then perform mapping (where possible) to another character set.
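The read-then-map step can be sketched in Python as follows. This only mirrors the idea; in DataStage the work is done by the NLS maps on the stage. The codec `tis_620` is an assumption about what the source character set might be, and `transcode` and the sample bytes are hypothetical.

```python
def transcode(raw: bytes, source_map: str, target_map: str = "utf-8") -> bytes:
    """Decode raw bytes with the source character set, re-encode in the target."""
    return raw.decode(source_map).encode(target_map)

# Hypothetical column value, stored as single-byte Thai (TIS-620) bytes.
raw_from_db = "สวัสดี".encode("tis_620")

# Once the source map is known, mapping to another character set is mechanical.
as_utf8 = transcode(raw_from_db, source_map="tis_620")
print(as_utf8.decode("utf-8"))
```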
sid19
Participant
Posts: 64
Joined: Mon Jun 18, 2007 12:17 am
Location: kolkata

Post by sid19 »

Hi,

In iSeries DB2, EBCDIC500 is the character set, and Thai characters are converted to EBCDIC500.

Thanks
Sid
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York

Post by arunkumarmm »

Then I believe you should use the same NLS map while reading it as well. Did you try that already?
Arun
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The EBCDIC 500 character set supports the Latin-1 characters. It is a single-byte encoding in which all the positions are already occupied by characters, so trying to encode any additional ones would mean that a single byte could have multiple meanings. If you read a 0x44 byte, it could be either the EBCDIC500 entry or perhaps a mapped Thai character, so there's absolutely no way for DataStage to determine which character it represents.

Perhaps the original 1-byte Thai characters have just been written, 1-1, into the table, in which case you could use a Thai character set to read it. If that wasn't done, your data is corrupted and unusable.
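A small Python sketch of the pass-through case, under stated assumptions: the Thai bytes are taken to be TIS-620 (`tis_620` in Python), `cp500` stands in for EBCDIC500, and the sample text is made up. It shows that the same bytes read differently under each character set, and that nothing is lost as long as the bytes themselves were stored unchanged.

```python
thai_text = "สวัสดี"  # hypothetical sample value

# Pass-through case: the 1-byte Thai codes were written 1-1 into the table.
stored = thai_text.encode("tis_620")

# Reading those bytes with the EBCDIC 500 map yields unrelated characters,
# because each byte value already has a meaning in that character set.
misread = stored.decode("cp500")

# Python's cp500 codec maps all 256 byte values one-to-one, so the misread
# string can be turned back into the original bytes and then read correctly.
recovered = misread.encode("cp500").decode("tis_620")

print(repr(misread))
print(recovered)
```

The recovery works only because no real conversion ever touched the bytes; if they had actually been converted to EBCDIC500 on the way in, there would be nothing left to recover.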