Stripping Non Printing Characters

Post by ShaneMuir »

Hi everybody

I am sure this is quite an easy one for all of you but here goes anyway:

I want to strip all the non-printing characters from an input string. I am aware of Oconv(expression, "MCP"), but I do not want the offending characters replaced with a period; I would rather replace them with a character of my choice.

I know it can be done, as I have seen it done before, but I cannot remember exactly how.

Thank you in advance
Shane

Post by crouse »

ereplace(Oconv(StringWithNonPrintingCharacters, "MCP"),".",MyNewCharacterOfChoice,0,1) will work.

If you might have periods in your string to begin with, do an inner ereplace to change them to a unique character first, then another outer ereplace to change them back to period after converting the non-printing characters.

For execution speed and reusability, do this all in one line and create a transform for it.

For maintainability and reusability, do it in several lines in a routine.
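
As a rough sketch of the nested approach described above, the steps could be spelled out in a routine like this. The routine name, the argument names InString and NewChar, and the "~" placeholder are assumptions for illustration only. The placeholder has to be a printable character that never occurs in the data (a non-printing one would itself be turned into a period by "MCP"), and it must differ from NewChar.

```
FUNCTION ReplaceNonPrinting(InString, NewChar)
* Hypothetical routine; Placeholder is assumed never to appear in InString.
Placeholder = "~"
* 1. Hide genuine periods behind the placeholder.
Work = Ereplace(InString, ".", Placeholder, 0, 1)
* 2. "MCP" turns every non-printing character into a period.
Work = Oconv(Work, "MCP")
* 3. The only periods left now mark non-printing characters.
Work = Ereplace(Work, ".", NewChar, 0, 1)
* 4. Restore the original periods.
Ans = Ereplace(Work, Placeholder, ".", 0, 1)
RETURN(Ans)
```

Collapsed into one expression for a transform, under the same assumptions, this would read Ereplace(Ereplace(Oconv(Ereplace(InString, ".", "~", 0, 1), "MCP"), ".", NewChar, 0, 1), "~", ".", 0, 1).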

-Craig
Craig Rouse
Griffin Resources, Inc
www.griffinresources.com

Post by ray.wurlod »

Welcome aboard! Long way from Brisbane!

About the only way to strip all non-printing characters is to create a Routine that processes each character.

Oconv with "MCA", "MC/A", "MCN", "MC/N" and "MCP" are probably too specific for your needs.

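Code:

FUNCTION RemoveNonPrintingCharacters(TheString)
* Keep only printable 7-bit ASCII characters (codes 32 to 126; 127 is DEL).
Ans = ""
CharCount = Len(TheString)
For cpos = 1 To CharCount
   TheChar = TheString[cpos,1]
   TheCode = Seq(TheChar)
   If TheCode >= 32 And TheCode <= 126 Then
      Ans := TheChar   ;* append the character itself, not its numeric code
   End
Next cpos
RETURN(Ans)
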
If you're working with a non-ASCII character set, such as a European character set that includes accented characters, your definition of "printable" may need to change.

If NLS is enabled, prefer UniSeq to Seq.
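
A hedged sketch of how the definition of "printable" might be widened on an NLS-enabled project. Purely for illustration, it assumes that code points 160 to 255 (the Latin-1 range that holds the accented letters) should also be kept. The routine name and that range are assumptions, not something stated in this thread, and would need adjusting to the character set actually in use.

```
FUNCTION RemoveNonPrintingLatin1(TheString)
* Hypothetical variant for NLS-enabled projects.
Ans = ""
CharCount = Len(TheString)
For cpos = 1 To CharCount
   TheChar = TheString[cpos,1]
   TheCode = UniSeq(TheChar)   ;* Unicode code point of the character
   Begin Case
      Case TheCode >= 32 And TheCode <= 126
         Ans := TheChar   ;* printable 7-bit ASCII
      Case TheCode >= 160 And TheCode <= 255
         Ans := TheChar   ;* assumed printable: Latin-1 accented range
   End Case
Next cpos
RETURN(Ans)
```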
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by ShaneMuir »

Thanks for that guys. Most appreciated.

Shane

PS: Yes, Ray, I am a long way from Brisbane.

Alternate Routine Code

Post by hassers »

Very similar to Ray's code, but it does an initial check for a null string being received and strips trailing spaces:

Input variable: SourceString

Ans = ''
lenvar = Len(TrimB(SourceString))
If lenvar < 1 Then Ans = ' '

For i = 1 To lenvar
   char1 = SourceString[i,1]
   nchar = Seq(char1)
   * Replace anything outside printable ASCII (32 to 126) with a space
   If nchar < 32 Or nchar > 126 Then
      Ans := ' '
   End Else
      Ans := char1
   End
Next i

Thanks

Steve

Post by ray.wurlod »

Trailing spaces may be valid data.

NULL would be OK in my routine, because Len(@NULL) is zero (go on, try it). However, my routine would return "" in that case, which may not be the desired result.

If you're going to write a completely bullet-proof routine, you need also to check that the incoming arguments are not in an unassigned state. Depending on how you like to cast the logic, use either the Assigned or the UnAssigned function to test.
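
One way the extra checks might look, as a minimal sketch rather than a definitive implementation. The routine name and the values returned for the unassigned and NULL cases are assumptions; pick whatever suits the job design. UnAssigned is tested first because other tests on an unassigned variable would fail.

```
FUNCTION RemoveNonPrintingCharactersSafe(TheString)
* Hypothetical guarded variant; special-case return values are assumptions.
If UnAssigned(TheString) Then
   Ans = ""        ;* assumed: treat an unassigned argument as empty
   RETURN(Ans)
End
If IsNull(TheString) Then
   Ans = @NULL     ;* assumed: pass NULL through rather than return ""
   RETURN(Ans)
End
Ans = ""
For cpos = 1 To Len(TheString)
   TheChar = TheString[cpos,1]
   TheCode = Seq(TheChar)
   * Keep printable 7-bit ASCII only (codes 32 to 126).
   If TheCode >= 32 And TheCode <= 126 Then Ans := TheChar
Next cpos
RETURN(Ans)
```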