How to read a multibyte character byte by byte

Posted: Mon May 21, 2007 6:28 pm
by vijaydev
I want to read multibyte characters one byte at a time, to find the total length in bytes that a string occupies, including its single-byte characters.

Posted: Mon May 21, 2007 7:14 pm
by chulett
Never had to do it, but if I had to I would go to the online help Index and type byte to see what turned up.

Posted: Mon May 21, 2007 8:33 pm
by ray.wurlod
The functions you require are listed in this post.

Posted: Mon May 21, 2007 8:40 pm
by vijaydev
My aim is to find and remove broken characters. Normally a character in a multibyte character set occupies two bytes, but a broken character occupies only a single byte. That is why I am trying to find the length. If you have any suggestions for how to find it, please let me know.

Posted: Tue May 22, 2007 1:15 am
by ray.wurlod
I still can't see that you have any proof that a character is "broken", whatever that means.

It is possible to read one byte at a time and to determine the byte value and byte type, but you will need to write your own routine, and specify NONE as the map - otherwise the map will attempt to convert the Korean characters into UV-UTF8 encoding of Unicode, which is used internally by DataStage.

The functions you may need are:
BYTELEN()
BYTE()
BYTETYPE()
BYTEVAL()

All of these can be found in the DataStage BASIC manual. You will need to have a reference manual for the specific encoding (for example PC1040, KSC5601) so that you understand what each individual byte in the multi-byte encoding is doing.
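
As a rough illustration of the byte-by-byte logic only, here is a minimal sketch in Python rather than DataStage BASIC. It assumes the data is EUC-KR (the usual byte encoding of KSC5601), where single-byte characters are below 0x80 and each two-byte character has both its lead and trail byte in the range 0xA1-0xFE. The function name strip_orphan_bytes is hypothetical, and the ranges must be checked against the reference for your actual encoding. Note that because lead and trail bytes share the same range, an orphaned byte immediately followed by another Korean character cannot be located unambiguously; the sketch only handles an orphan followed by a single-byte character (such as a newline) or by the end of the data.

```python
def strip_orphan_bytes(raw: bytes) -> bytes:
    """Return raw with bytes removed that cannot belong to a complete
    EUC-KR character (assumed ranges: ASCII < 0x80, pairs in 0xA1-0xFE)."""
    out = bytearray()
    i = 0
    while i < len(raw):
        b = raw[i]
        if b < 0x80:
            # Single-byte (ASCII) character: keep as-is.
            out.append(b)
            i += 1
        elif (0xA1 <= b <= 0xFE
              and i + 1 < len(raw)
              and 0xA1 <= raw[i + 1] <= 0xFE):
            # Lead byte followed by a valid trail byte: keep the pair.
            out += raw[i:i + 2]
            i += 2
        else:
            # High byte with no valid partner: treat as broken and drop it.
            i += 1
    return bytes(out)


# Example: the second character is cut off after its lead byte.
sample = "한글".encode("euc_kr")[:3]
cleaned = strip_orphan_bytes(sample)
print(len(sample), len(cleaned), cleaned.decode("euc_kr"))
```

The same loop could be written in a DataStage routine using the byte functions listed above, with the map set to NONE so that the raw bytes are seen unconverted.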

Posted: Tue May 22, 2007 2:52 am
by vijaydev
In the job I am getting Korean characters as input. In the front end, users copy and paste large amounts of text into a text box, more than its size allows. When they press Enter, only one byte of a two-byte character is stored in the database. I want to remove that character. In DataStage it shows as ? and in the database it shows as something else. Please help me remove that character, and provide the logic as well if possible.

Posted: Wed May 23, 2007 5:19 am
by vijaydev
Can anyone help me read Korean characters byte by byte, or tell me how to remove the half character?