So far, my parallel routines that take strings have simply accepted char* pointers, and that's been fine. I've assumed the strings coming in are null-terminated ASCII.
But isn't all DataStage data processed internally as Unicode? Is it flattened down to ASCII before being passed to a parallel routine? How would I write a PX routine that processes Unicode text?
Parallel routines and Unicode
Phil Hibbs | Capgemini
Technical Consultant
ray.wurlod wrote: With NLS enabled it should always be Unicode internally. With NLS not installed I understand it is ASCII internally.
NLS is enabled, so does that mean the strings are in UTF-8 format? I just ran a test, and indeed, passing in a string that has a non-ASCII character seems OK, but I suspect my single-character replacement method might make a mistake with multi-byte characters. I need to read up on handling UTF-8 in C.
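To make the multi-byte concern concrete, here is a minimal sketch in plain C, assuming the routine really does receive a null-terminated UTF-8 char* as the test above suggests. The helper names (utf8_seq_len, utf8_char_count, replace_ascii) are made up for illustration and are not part of any DataStage API. The point it shows: every byte of a multi-byte UTF-8 sequence is 0x80 or higher, so a byte-wise replacement of one ASCII character by another ASCII character cannot damage a multi-byte character, while anything that counts or replaces non-ASCII characters has to walk whole sequences.

```c
#include <stddef.h>

/* Hypothetical helper: length in bytes of the UTF-8 sequence that starts
   with lead byte b, or 0 if b is a continuation byte or an invalid lead. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;               /* 0xxxxxxx: plain ASCII */
    if (b >= 0xC2 && b <= 0xDF) return 2; /* 110xxxxx */
    if (b >= 0xE0 && b <= 0xEF) return 3; /* 1110xxxx */
    if (b >= 0xF0 && b <= 0xF4) return 4; /* 11110xxx */
    return 0;
}

/* Hypothetical helper: count characters (code points), not bytes.
   A stray or truncated sequence is counted as one character. */
static size_t utf8_char_count(const char *s)
{
    size_t count = 0;
    while (*s) {
        size_t n = utf8_seq_len((unsigned char)*s);
        size_t i = 1;
        /* Consume only the continuation bytes that are really present,
           so a truncated sequence never walks past the terminator. */
        while (i < n && ((unsigned char)s[i] & 0xC0) == 0x80)
            i++;
        s += i;
        count++;
    }
    return count;
}

/* In-place replacement that is safe on UTF-8 input ONLY when both
   'from' and 'to' are ASCII (below 0x80): no byte of a multi-byte
   sequence can ever compare equal to an ASCII byte. */
static void replace_ascii(char *s, char from, char to)
{
    for (; *s; s++)
        if (*s == from)
            *s = to;
}
```

For example, on the 7-byte string "caf\xC3\xA9,x" (café,x), utf8_char_count reports 6 characters, and replace_ascii(buf, ',', ';') leaves the two bytes of é untouched. Replacing é itself would need a different approach, because the replacement may have a different byte length.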
Phil Hibbs | Capgemini
Technical Consultant
It looks like that would convert the string to UTF-8 before passing it to my routine, but it is already being passed as UTF-8. What I want is an easy way to process UTF-8 strings within my C or C++ routine. I'm looking at ICU (http://site.icu-project.org/) at the moment, when I have time and access.
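Until ICU is available, replacing one character with another can be sketched by hand in plain C. This is only an illustration under assumptions: the input is a null-terminated UTF-8 string, the caller supplies the output buffer, and the function names (utf8_decode, utf8_encode, utf8_replace) are invented here rather than taken from ICU or DataStage. The decoder is deliberately minimal: it does not reject overlong forms or surrogates, and a malformed byte comes out as U+FFFD.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from s into *cp and return the bytes consumed.
   A malformed or truncated sequence yields U+FFFD and consumes 1 byte. */
static size_t utf8_decode(const char *s, uint32_t *cp)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n, i;

    if (p[0] < 0x80)                { *cp = p[0];        return 1; }
    else if ((p[0] & 0xE0) == 0xC0) { *cp = p[0] & 0x1F; n = 2; }
    else if ((p[0] & 0xF0) == 0xE0) { *cp = p[0] & 0x0F; n = 3; }
    else if ((p[0] & 0xF8) == 0xF0) { *cp = p[0] & 0x07; n = 4; }
    else                            { *cp = 0xFFFD;      return 1; }

    for (i = 1; i < n; i++) {
        /* The terminating NUL fails this check, so we never overrun. */
        if ((p[i] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
        *cp = (*cp << 6) | (p[i] & 0x3F);
    }
    return n;
}

/* Encode cp as UTF-8 into out (at least 4 bytes); return bytes written. */
static size_t utf8_encode(uint32_t cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Copy src to dst, replacing every code point 'from' with 'to'.
   Returns bytes written (excluding the NUL), or (size_t)-1 if dst
   is too small. The result may be shorter or longer than src. */
static size_t utf8_replace(const char *src, uint32_t from, uint32_t to,
                           char *dst, size_t dst_size)
{
    size_t used = 0;

    while (*src) {
        uint32_t cp;
        char tmp[4];
        size_t n = utf8_decode(src, &cp);
        size_t m = utf8_encode(cp == from ? to : cp, tmp);
        size_t i;

        if (used + m + 1 > dst_size)   /* keep room for the NUL */
            return (size_t)-1;
        for (i = 0; i < m; i++)
            dst[used++] = tmp[i];
        src += n;
    }
    if (used + 1 > dst_size)
        return (size_t)-1;
    dst[used] = '\0';
    return used;
}
```

For instance, utf8_replace("na\xC3\xAF" "ve", 0x00EF, 'i', out, sizeof out) writes "naive" and returns 5, one byte shorter than the input. That size change is why this sketch writes to a separate buffer instead of editing in place. A library such as ICU would additionally cover validation, normalization and combining characters, which this sketch ignores.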
Phil Hibbs | Capgemini
Technical Consultant