Is there any function in BASIC similar to StringTokenizer in Java or strtok in C/C++?
I'm trying to parse a list of text values (like a regular language sentence) and replace every non-alphanumeric character with a single replacement character (one per occurrence).
Ereplace() is very specific. Convert() requires me to supply all the special characters in advance and to specify as many replacement characters. I'm looking for something simpler, if such a function exists.
String Tokenizer
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI
Why not use Convert multiple times, something similar to:
convert (convert('1234567890','',col1),str(<replacement char/s>,len(convert('1234567890','',col1))),col1)
Sorry, I can't validate the syntax at the moment. It's been a long time since I used it.
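For readers who want to check the logic of the nested call outside DataStage, here is a minimal Python sketch (not DataStage BASIC). The `convert` helper is a hypothetical stand-in. It assumes that Convert(fromList, toList, string) maps each character in fromList to the character at the same position in toList, deletes it when toList is too short, and uses the first occurrence when a character is repeated in fromList. `replace_all_except` is likewise an illustrative name.

```python
def convert(from_chars, to_chars, text):
    """Hypothetical model of Convert(fromList, toList, string)."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch in mapping:
            continue  # assumption: the first occurrence in from_chars wins
        # None marks "delete this character" when to_chars is too short.
        mapping[ch] = to_chars[i] if i < len(to_chars) else None
    out = []
    for ch in text:
        if ch not in mapping:
            out.append(ch)            # character is not listed: keep it
        elif mapping[ch] is not None:
            out.append(mapping[ch])   # character is listed: substitute it
    return "".join(out)


def replace_all_except(keep_chars, replacement, text):
    """Replace every character of text that is not in keep_chars."""
    # Inner call: delete the characters to keep, leaving only the unwanted ones.
    unwanted = convert(keep_chars, "", text)
    # Outer call: map each unwanted character to the replacement character.
    return convert(unwanted, replacement * len(unwanted), text)


print(replace_all_except("1234567890", "#", "a1-b2"))  # prints "#1##2"
```

The inner call finds the unwanted characters at run time, so they do not need to be listed in advance. Only the set of characters to keep is hard-coded.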
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
There's no function out of the box, but it would be very easy to create one.
Assuming that the string is already space-separated there is no real need to tokenise - if you think there is, please provide a more exact specification.
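Code:
FUNCTION ReplaceNonAlphaNumerics(aString, aReplaceChar)
AlphaNumerics = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
REM Add lower case alphabetic characters to string if required.
REM Strip the alphanumerics, leaving only the characters to be replaced.
NonAlphaNumerics = Convert(AlphaNumerics, "", aString)
REM Map each remaining character to the replacement character.
Ans = Convert(NonAlphaNumerics, Str(aReplaceChar, Len(NonAlphaNumerics)), aString)
RETURN(Ans)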
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks for validating my fears. I'm going to continue using Convert().
Effectively we are tokenizing a regular language sentence into a set of delimited 'words' and pivoting them, looking them up against one table to eliminate 'useless' words and text noise (prepositions, articles, etc.), then scanning the significant words against another keyword lookup table (which can keep growing) to do an English keyword search. Each sentence comes from a column called "reason desc", which stores free-form text. For each reason id, we assign a weight per keyword, sum the weights, and write back a total score that indicates how meaningful the sentence is likely to be as input to a text analytics engine.
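To make the flow concrete, here is a rough Python sketch of the scoring logic (not DataStage code). The stop-word set, keyword weights, and sample rows are invented for illustration. In the real job they would come from the two lookup tables and the "reason desc" column.

```python
# Hypothetical stand-ins for the two lookup tables described above.
STOP_WORDS = {"THE", "A", "AN", "OF", "TO", "IN"}
KEYWORD_WEIGHTS = {"REFUND": 5, "DAMAGED": 3, "LATE": 2}


def tokenize(sentence):
    """Upper-case the text, turn each non-alphanumeric into a space, split."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in sentence.upper())
    return cleaned.split()  # split() also drops the empty tokens


def score_reasons(rows):
    """Sum keyword weights per reason id; rows are (reason_id, reason_desc)."""
    totals = {}
    for reason_id, reason_desc in rows:
        totals.setdefault(reason_id, 0)  # ids with no keywords still score 0
        for word in tokenize(reason_desc):
            if word in STOP_WORDS:
                continue  # noise word: skip it
            totals[reason_id] += KEYWORD_WEIGHTS.get(word, 0)
    return totals


rows = [
    (1, "Item was damaged, refund requested!"),
    (2, "The weather in Troy"),
    (1, "Late delivery"),
]
print(score_reasons(rows))  # prints {1: 10, 2: 0}
```

Words that appear in neither table contribute nothing, so the stop-word check mainly saves lookups against the keyword table as it grows.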
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Once you've effected the conversion, change the space characters to Char(10), write to a text file with no formatting, and read back from the text file with line terminator specified.
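The file round trip can be pictured with a small Python sketch (again, not DataStage code). The temporary file stands in for the text file the job would write, and reading it line by line stands in for reading it back with a line terminator specified. Skipping the blank lines produced by consecutive spaces is an assumption of this sketch.

```python
import os
import tempfile


def pivot_words_via_file(cleaned):
    """Write one word per line, then read the lines back as separate rows."""
    one_per_line = cleaned.replace(" ", "\n")  # space -> Char(10)
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # newline="\n" keeps the line feeds exactly as written on any platform.
        with os.fdopen(fd, "w", newline="\n") as handle:
            handle.write(one_per_line)
        with open(path, "r", newline="\n") as handle:
            # Consecutive spaces become blank lines; drop them here.
            return [line.rstrip("\n") for line in handle if line.strip()]
    finally:
        os.remove(path)


print(pivot_words_via_file("LATE DELIVERY  REFUND"))
# prints ['LATE', 'DELIVERY', 'REFUND']
```

Each word becomes its own row, which is the pivot needed before the lookups.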