How to remove non-ascii char from a string
Can anyone give me an idea of how to strip non-ASCII chars from a string? For example, removing tab chars.
If you are using custom buildops, you can take advantage of some of the C/C++ built-in functions. We wrote a function called stringToGraph that takes a string as input, loops through each character, and uses the C macro isgraph(c) to test whether character c is a graphic character (a letter, digit, or punctuation mark; that is, anything printable except a space). All non-graphic characters, including tabs and other control characters, are converted to spaces.
There are a number of other built-in C macros (declared in <ctype.h>) that can be used as well:
- isalpha(c) - c is a letter
- isupper(c) - c is an uppercase letter
- islower(c) - c is a lowercase letter
- isdigit(c) - c is a digit
- isalnum(c) - c is a letter or a digit (alphanumeric)
- isxdigit(c) - c is a hexadecimal digit
- isspace(c) - c is a whitespace character (space, tab, newline, etc.)
- ispunct(c) - c is a punctuation character
- isprint(c) - c is a printable character, including space
- isgraph(c) - c is printable, but not a space
- iscntrl(c) - c is a control character
- isascii(c) - c is a 7-bit ASCII code (0-127); this one is a POSIX extension rather than ISO C
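To make the differences between these tests concrete, here is a minimal sketch in standard C++. The helper name classify is hypothetical (it is not part of the buildop), and the ASCII check is written as a plain range test rather than calling isascii, since isascii is not available on every platform.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the buildop): lists which character
// classes a single character belongs to.
std::string classify(char ch) {
    // Cast to unsigned char first: passing a negative value to the
    // <cctype> functions is undefined behavior.
    const unsigned char c = static_cast<unsigned char>(ch);
    std::string flags;
    if (c < 0x80)        flags += "ascii ";  // same range isascii() accepts
    if (std::isprint(c)) flags += "print ";
    if (std::isgraph(c)) flags += "graph ";
    if (std::isspace(c)) flags += "space ";
    if (std::iscntrl(c)) flags += "cntrl ";
    if (std::isalnum(c)) flags += "alnum ";
    if (std::ispunct(c)) flags += "punct ";
    return flags;
}
```

In the default "C" locale, classify('\t') yields "ascii space cntrl " and classify(' ') yields "ascii print space ". So a tab is an ASCII character, just not a printable or graphic one, which is why isgraph (or isprint) is the more useful test for this kind of cleanup.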
Note: We include the function in our buildop, but I have heard that you can also create custom functions like these and make them accessible within a Transformer (parallel only, not basic). I don't know how to do this, but perhaps someone else does?
Remember, we use it in a buildop (Build/Logic/Definitions). I think there are ways to create your own C/C++ functions that can be referenced from a Transformer, but I do not know how.
Code:
// Convert any non-graphical characters to ' '
// (isgraph() is declared in <ctype.h>)
APT_String stringToGraph(APT_String);

APT_String stringToGraph(APT_String str) {
    APT_String new_str;
    int i = 0, max = 0;

    // Walk the input string: copy graphic characters unchanged
    // and replace every non-graphic character with a ' '.
    max = str.length();
    const char* ptr = str.terminatedContent();
    for (i = 0; i < max; i++) {
        // Cast to unsigned char: passing a negative char to isgraph() is undefined.
        if ( !isgraph((unsigned char) ptr[i]) )
        {
            new_str = new_str + ' ';
        }
        else
        {
            new_str = new_str + ptr[i];
        }
    }
    return new_str;
}
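stringToGraph replaces unwanted characters with spaces, while the original question asked about removing them. As a rough sketch of that variant, the function below drops the characters instead. It uses std::string rather than APT_String so it can be compiled and tried outside DataStage. The name removeNonPrintableAscii is hypothetical, and the choice to keep ordinary spaces (isprint instead of isgraph) is an assumption about what the poster wants.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `input` without any byte that is
// outside 7-bit ASCII or is not printable (tabs, newlines, other control
// characters). Ordinary spaces are kept.
std::string removeNonPrintableAscii(const std::string& input) {
    std::string result;
    result.reserve(input.size());
    // Iterating as unsigned char keeps the isprint() call well defined
    // for bytes >= 0x80.
    for (unsigned char c : input) {
        if (c < 0x80 && std::isprint(c)) {
            result.push_back(static_cast<char>(c));
        }
    }
    return result;
}
```

For example, removeNonPrintableAscii("a\tb") returns "ab". Inside a buildop the same loop should carry over to APT_String using the length() and terminatedContent() calls shown above, although that adaptation is untested here.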