Removing non-alphanumeric characters using Convert function


abc123
Premium Member
Posts: 605
Joined: Fri Aug 25, 2006 8:24 am

Post by abc123 »

I am trying to remove all non-alphanumeric characters from a string. Two questions:
1) How do you write the following?

Convert("""", MyString)

This does not compile; it gives a validation error.

2) Is there any way to do it in PX without writing a routine? I went through all the posts here but I don't see a PX solution.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Convert() requires three arguments.
Try something like this:

Convert("~!@#$%^&*()_+=-`,./;[]\|}{:?>< ", "", in.Col)
That should be close.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
abc123
Premium Member
Posts: 605
Joined: Fri Aug 25, 2006 8:24 am

Post by abc123 »

DSguru2B, thank you for your response. Yes, I am aware that Convert requires three arguments; I made a mistake in my first post. I was already doing exactly what you posted. The error happens because of the double quote character inside the double-quoted string.

If I put in a double quote as follows:

Convert("~!@#$%^&*()_+=-`,./;[]\|}{:?>< "", "", MyString)

it gives a validation error. I tried putting in two double quotes together so that the compiler would translate them into one, but that gives the same validation error.

Also, I was looking for a solution that would also take care of all non-alphanumeric characters including non-printable characters.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There are three quote characters available. Surround the string with one of the others. Concatenate if you must.

Convert('"':"'", "", InLink.TheString)
The first piece is a double quote character surrounded by single quote characters, the second piece is a single quote character surrounded by double quote characters.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abc123
Premium Member
Posts: 605
Joined: Fri Aug 25, 2006 8:24 am

Post by abc123 »

Thanks, Ray. It worked. Can you tell me if there is a way in PX jobs to keep only alphanumeric characters and strip everything else, including non-printable characters, without using a routine and ASCII codes?
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

The Convert() function will work to retain only alphanumeric characters. But for non-printable characters, I think you will have to go the routine route.
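
One way to picture the "retain only alphanumerics" use of Convert() is to nest two calls: the inner one deletes the characters you want to keep, which leaves only the unwanted ones, and the outer one deletes exactly those from the original value. The following is a minimal Python model of that idea, not DataStage code. The helper names `convert` and `keep_alphanumeric` are hypothetical, and the model assumes that any character in the from-list with no counterpart in the to-list is simply deleted.

```python
import string


def convert(from_list: str, to_list: str, expression: str) -> str:
    """Rough model of Convert(fromlist, tolist, expression).

    from_list[i] is replaced by to_list[i]; characters in from_list that
    have no counterpart in to_list are deleted (assumed behaviour).
    """
    mapping = {}
    for i, ch in enumerate(from_list):
        if ch in mapping:
            continue  # assumption in this model: the first occurrence wins
        mapping[ch] = to_list[i] if i < len(to_list) else ""
    # Characters not listed in from_list pass through unchanged.
    return "".join(mapping.get(ch, ch) for ch in expression)


# The characters to keep: A-Z, a-z and 0-9.
KEEP = string.ascii_letters + string.digits


def keep_alphanumeric(value: str) -> str:
    # Inner call: delete the characters to keep, leaving only the unwanted ones.
    unwanted = convert(KEEP, "", value)
    # Outer call: delete exactly those unwanted characters from the original.
    return convert(unwanted, "", value)


if __name__ == "__main__":
    sample = 'A"b\'c-1 2\t3!'
    print(keep_alphanumeric(sample))
```

In this model anything outside the keep list is dropped, including the tab in the sample. Whether a PX Transformer treats every non-printable byte the same way is not something this sketch establishes.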
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You will need a routine or a BASIC Transformer stage.
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

ray.wurlod wrote: You will need a routine or a BASIC Transformer stage.
Ray, how can we do it in a BASIC Transformer without a routine?
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

Try this: put an External Filter stage in your job and use a Unix command in it to strip the non-alphanumeric characters.
Last edited by JoshGeorge on Mon May 07, 2007 2:25 am, edited 1 time in total.
Joshy George
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

ady wrote: Ray, how can we do it in a BASIC Transformer without a routine?
Use OCONV/ICONV with the MCP conversion code to convert all non-printable characters to a dot. Then use the Convert() statement above to keep only letters and numbers.