How to read a multibyte character byte by byte

vijaydev · Post by **vijaydev** » Mon May 21, 2007 6:28 pm

I want to read multibyte characters one byte at a time so that I can find the total length they occupy, including any single-byte characters.

chulett · Post by **chulett** » Mon May 21, 2007 7:14 pm

Never had to do it, but if I had to I would go to the online help Index and type byte to see what turned up.

ray.wurlod · Post by **ray.wurlod** » Mon May 21, 2007 8:33 pm

The functions you require are listed in this post.

vijaydev · Post by **vijaydev** » Mon May 21, 2007 8:40 pm

My aim is to find and remove broken characters. Normally, in a multibyte character set, one character occupies two bytes, but a broken character occupies only a single byte. That is why I am trying to find the length. If you know how to do this, please let me know.

ray.wurlod · Post by **ray.wurlod** » Tue May 22, 2007 1:15 am

I still can't see that you have any proof that a character is "broken" whatever that means.

It is possible to read one byte at a time and to determine the byte value and byte type, but you will need to write your own routine, and specify NONE as the map - otherwise the map will attempt to convert the Korean characters into UV-UTF8 encoding of Unicode, which is used internally by DataStage.

The functions you may need are:
BYTELEN()
BYTE()
BYTETYPE()
BYTEVAL()

All of these can be found in the DataStage BASIC manual. You will need to have a reference manual for the specific encoding (for example PC1040, KSC5601) so that you understand what each individual byte in the multi-byte encoding is doing.
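As an illustration of the byte-at-a-time logic described above, here is a minimal sketch in Python rather than DataStage BASIC. It assumes the data is EUC-KR (the usual byte encoding of KSC5601), where ASCII bytes are below 0x80 and each Korean character is a lead byte and a trail byte both in the range 0xA1 to 0xFE. The function name `strip_orphan_bytes` is hypothetical, and the byte ranges would need checking against the reference for whichever map is actually in use. A routine in DataStage BASIC could follow the same loop using the byte functions listed above.

```python
def strip_orphan_bytes(data: bytes) -> bytes:
    """Drop high bytes that are not part of a complete two-byte pair.

    Assumes EUC-KR: single bytes < 0x80 are ASCII; a two-byte character
    is a lead byte and a trail byte, both in 0xA1..0xFE.
    """
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Single-byte (ASCII) character: keep it.
            out.append(b)
            i += 1
        elif (0xA1 <= b <= 0xFE
              and i + 1 < n
              and 0xA1 <= data[i + 1] <= 0xFE):
            # Complete two-byte character: keep both bytes.
            out += data[i:i + 2]
            i += 2
        else:
            # High byte with no valid partner: treat as a broken half and skip it.
            i += 1
    return bytes(out)


# Two complete Korean characters followed by a truncated lead byte.
sample = b"\xc7\xd1\xb1\xdb" + b"\xc7"
cleaned = strip_orphan_bytes(sample)
print(len(sample), len(cleaned))
```

One limitation of this approach: a scan from the left can only recognise an orphaned byte when it sits at the end of the data or is followed by a single-byte character. If a broken half is immediately followed by another two-byte character, the bytes pair up incorrectly and the damage cannot be located from the byte values alone.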

vijaydev · Post by **vijaydev** » Tue May 22, 2007 2:52 am

In the job I am receiving Korean characters as input. In the front end, users copy and paste text that is larger than the text box allows. When they press Enter, only one byte of a two-byte character is stored in the database. I want to remove that character. DataStage shows it as ? and the database shows something else. Please help, and provide the logic as well if possible.

vijaydev · Post by **vijaydev** » Wed May 23, 2007 5:19 am

Can anyone help me read Korean characters byte by byte, or tell me how to remove the half character?