How do I remove broken Korean characters, and how can I read Korean text byte by byte when NLS is enabled? I am currently using the iconv function, e.g. iconv("inputstring","MY"), but it reads two bytes as one character. I want to read the data byte by byte. Is this possible in DataStage? Please help.
Vijay
I want to remove broken Korean characters
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard.
Yes it is possible, but you will need to write your own routine, and specify NONE as the map - otherwise the map will attempt to convert the Korean characters into UV-UTF8 encoding of Unicode, which is used internally by DataStage.
The functions you may need are:
BYTELEN()
BYTE()
BYTETYPE()
BYTEVAL()
All of these can be found in the DataStage BASIC manual. You will need to have a reference manual for the specific encoding (for example PC1040, KSC5601) so that you understand what each individual byte in the multi-byte encoding is doing.
I seriously doubt that a Korean character can be broken. Korean characters are very simple, consisting of a "consonant", a "vowel" and a "consonant" (for example "Kim"), or a "consonant" and a "vowel" (for example "Ji"), or a "vowel" and a "consonant" (for example "On"). Diphthongs (such as "ng") count as consonants. Character encoding is likewise fairly straightforward, until and unless you need the extremely large formal character set (which I understand is used only by the Korean Navy and diplomatic service).
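If "broken" here means a two-byte character that has lost one of its bytes (for example through truncation), the byte-level scan could look roughly like the sketch below. It is written in Python rather than DataStage BASIC, purely to illustrate the logic you would port into your own routine. It assumes the data is EUC-KR (the usual byte encoding of KS C 5601), where both bytes of a Korean character fall in the range 0xA1 to 0xFE; check the reference for your actual encoding. The function name `strip_broken_euc_kr` is made up for this example.

```python
def strip_broken_euc_kr(data: bytes) -> bytes:
    """Drop bytes that cannot form a complete two-byte EUC-KR character.

    Assumption: lead and trail bytes are both in 0xA1..0xFE and
    single-byte (ASCII) values are below 0x80.
    """
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte: keep as-is.
            out.append(b)
            i += 1
        elif 0xA1 <= b <= 0xFE and i + 1 < n and 0xA1 <= data[i + 1] <= 0xFE:
            # Complete lead/trail pair: keep both bytes.
            out += data[i:i + 2]
            i += 2
        else:
            # Lone lead byte or stray high byte: treat as broken and skip it.
            i += 1
    return bytes(out)


if __name__ == "__main__":
    kim = "김".encode("euc_kr")
    damaged = kim + b"A" + kim[:1]  # last character truncated to one byte
    print(strip_broken_euc_kr(damaged).decode("euc_kr"))
```

One limitation to keep in mind: a lone lead byte that happens to be followed by another high byte will be paired with it, so a scan like this can only catch breaks that leave an impossible byte sequence.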
Last edited by ray.wurlod on Mon May 21, 2007 12:26 am, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I did.
For only a few cents per day you can obtain premium membership, which would let you see the entire post. This revenue is 100% used to help pay for the bandwidth charges for DSXchange.
There's a link from the home page that you can use to sign up.
Help me remove broken Korean characters
Please help me remove broken Korean characters. Which function should I use?
Vijay