How to read a multibyte character byte by byte
Moderators: chulett, rschirm, roy
I want to read multibyte characters one byte at a time to find the total length in bytes that a string occupies, including its single-byte characters.
Vijay
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The functions you require are listed in this post.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I still can't see that you have any proof that a character is "broken" whatever that means.
It is possible to read one byte at a time and to determine the byte value and byte type, but you will need to write your own routine, and specify NONE as the map - otherwise the map will attempt to convert the Korean characters into UV-UTF8 encoding of Unicode, which is used internally by DataStage.
The functions you may need are:
BYTELEN()
BYTE()
BYTETYPE()
BYTEVAL()
All of these can be found in the DataStage BASIC manual. You will need to have a reference manual for the specific encoding (for example PC1040, KSC5601) so that you understand what each individual byte in the multi-byte encoding is doing.
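As a rough illustration of the byte-at-a-time logic described above, here is a minimal sketch in Python rather than DataStage BASIC. It assumes the data is EUC-KR (the usual byte encoding of KSC5601), where byte values 0x00-0x7F are single-byte characters and 0xA1-0xFE are the lead or trail byte of a two-byte character; check the reference manual for your actual encoding before relying on these ranges. The names is_double_byte and scan_bytes are hypothetical, and a real routine would presumably be built on the BASIC byte functions listed above with NONE as the map.

```python
# Illustration only: Python, not DataStage BASIC.
# Assumption: input is EUC-KR, so 0x00-0x7F is a single-byte character
# and 0xA1-0xFE is one half (lead or trail) of a two-byte character.

def is_double_byte(value):
    """Return True if this byte value can be half of a two-byte character."""
    return 0xA1 <= value <= 0xFE


def scan_bytes(raw):
    """Walk the raw bytes one at a time.

    Returns (total_bytes, cleaned): the total byte length of the input,
    and a copy of the input with incomplete two-byte characters removed.
    """
    cleaned = bytearray()
    i = 0
    while i < len(raw):
        value = raw[i]
        if value <= 0x7F:
            # Single-byte character: keep it as is.
            cleaned.append(value)
            i += 1
        elif (is_double_byte(value) and i + 1 < len(raw)
              and is_double_byte(raw[i + 1])):
            # Complete two-byte character: keep both bytes.
            cleaned += raw[i:i + 2]
            i += 2
        else:
            # Lead byte without a valid trail byte (or a byte outside
            # the assumed ranges): drop it.
            i += 1
    return len(raw), bytes(cleaned)


# Two Korean characters (2 bytes each) followed by " ab": 7 bytes in total.
sample = b"\xc7\xd1\xb1\xdb ab"
# Simulate a damaged value: the second character loses its trail byte,
# which is replaced by a newline.
damaged = sample[:3] + b"\n" + sample[4:]

total, cleaned = scan_bytes(damaged)
```

Here total is the byte length of the damaged value, single-byte characters included, and cleaned is the same value without the orphaned half character. Whether dropping the byte is the right repair depends on the data; replacing it with a marker character is an equally simple variation.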
In the job I am getting Korean characters as input. In the front end, users copy and paste text that is larger than the text box allows. When they press Enter, only one byte of a two-byte character is stored in the database. I want to remove that character. In DataStage it shows as ? and in the database it shows as something else. Please help, and provide the logic as well if possible.
Vijay