Extract Alphanumeric characters and space only

rameshrr3
Premium Member
Posts: 609
Joined: Mon May 10, 2004 3:32 am
Location: BRENTWOOD, TN

Post by rameshrr3 »

I have a requirement to extract only the alphanumeric data from a string. Non-alphanumeric data (!@#$%^&*(). etc.) should be replaced with an empty string. I haven't been able to use Oconv() effectively for this. I wrote a routine which seems to work.

One more piece of info: no nulls are expected in the input. Leading and trailing spaces should be trimmed in the output. Embedded spaces should be preserved!

Will this code work? Is there any defect in my approach? Any help is appreciated.
Arg1 : input data field

Arg1 Output

"Str^&% ng" --> "Str ng"
" Strin^G" --> "StrinG"

Code:

L = Len(Trim(Arg1))
A = Trim(Arg1)
B = ""

For i=1 To L Step 1;
 Ac = Seq(A[i,1])
 
 Begin Case
   Case (Ac > 47 And Ac < 58) 
    d = A[i,1];
   Case (Ac > 64 And Ac < 91)
    d = A[i,1];
   Case (Ac > 96 And Ac < 123)
    d = A[i,1];
   Case (Ac = 32)
    d = A[i,1];
   Case @True
    d = "";
 End Case
B = B:d;
Next i 


Ans = B

Sorry for the lack of inline comments!
wnogalski
Charter Member
Posts: 54
Joined: Thu Jan 06, 2005 10:49 am
Location: Warsaw

Post by wnogalski »

Have you tried the Convert function? It should satisfy all your requirements, and you won't have to write a routine:

Code:

Convert("((!@#$%^&*().", "", Link.TheString)

Regards,
Wojciech Nogalski

Post by rameshrr3 »

Thanks, but the characters that I need to suppress are more than those. Anything that's not an English letter or a number should be suppressed. Input will always be ASCII text only (not Unicode).
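
An allow-list like this can still be expressed with Convert by applying it twice: the inner call deletes every permitted character from the input, leaving only the unwanted ones, and the outer call deletes those from the original. A minimal sketch, assuming ASCII input as stated above. The variable names Allowed and Unwanted are illustrative only, and the three-argument form of Trim is used so that only leading and trailing spaces are removed:

```
* Characters to keep: English letters, digits and the space
Allowed = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 "
* Deleting the allowed characters leaves only the unwanted ones
Unwanted = Convert(Allowed, "", Arg1)
* Deleting the unwanted characters from the original leaves the allowed ones
Ans = Trim(Convert(Unwanted, "", Arg1), " ", "B")
```

Because the set of unwanted characters is derived from the data itself, nothing has to be enumerated beyond the characters that are permitted.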
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The major flaw is that you have not considered all possible characters. A more general and more efficient routine might be:

Code:

FUNCTION AlphaSpace(TheString)
* Returns alphabetic and space characters only
Ans = ""
If IsNull(TheString) 
Then
   Ans = @NULL
End
Else
   CharCount = Len(TheString)
   For i = 1 To CharCount
      TheChar = TheString[i,1]
      If Alpha(TheChar) Or (TheChar = " ")
      Then
         Ans := TheChar
      End
   Next i
End

RETURN(Ans)

This will work with "high range ASCII" (accented characters) and with NLS enabled with any character set. In the latter case, alphabetic characters are those defined as alphabetic in the CHARACTER category of the currently selected locale.
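
Note that Alpha() is true only for letters, so the routine as posted drops digits as well as punctuation. If digits must be kept, as the original question asks, one possible variant is sketched below. The name AlphaNumSpace is hypothetical, the digit test uses Index against a literal list (an assumption about the simplest safe check, not part of the routine above), and the final Trim covers the requirement to remove leading and trailing spaces only:

```
FUNCTION AlphaNumSpace(TheString)
* Returns alphabetic, numeric and space characters only
* (hypothetical variant of the AlphaSpace routine)
Ans = ""
If IsNull(TheString) Then
   Ans = @NULL
End Else
   CharCount = Len(TheString)
   For i = 1 To CharCount
      TheChar = TheString[i,1]
      * Keep letters, plus any character found in the digit-and-space list
      If Alpha(TheChar) Or (Index("0123456789 ", TheChar, 1) > 0) Then
         Ans := TheChar
      End
   Next i
   * Strip leading and trailing spaces, leaving embedded spaces alone
   Ans = Trim(Ans, " ", "B")
End

RETURN(Ans)
```

With this variant, the sample inputs from the first post would be expected to give "Str ng" and "StrinG".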

A final point: DataStage BASIC does NOT use trailing semi-colons. In DataStage BASIC a semi-colon separates multiple statements on the same line. A trailing semi-colon separates the statement to its left from the empty statement to its right, which is why the code compiles and works.
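
The separator behaviour can be seen in a few lines. A minimal sketch, reusing the variable names B and d from the original routine:

```
* The semi-colon separates two statements on one line
B = "" ; d = ""
* A trailing semi-colon is followed by an empty statement, so this still compiles
B = B:d;
* Preferred form: no trailing semi-colon
B = B:d
```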
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by rameshrr3 »

Thanks Ray for your suggestion.