Removing Control and Non-ASCII characters
-
- Premium Member
- Posts: 38
- Joined: Fri Apr 22, 2005 6:07 am
Hi All,
Following is the code being used in Ab Initio:
out.pd_into_ovrdrft_cd :: string_replace(re_replace(in.pd_into_od_cd, "[\001-\037|\177-\377]", " "), char_string(0), " " );
My understanding is that, within the column pd_into_od_cd, any occurrence of a value between \001 and \037 (control characters) or between \177 and \377 (non-ASCII or non-printable characters) should ultimately be replaced by a single space.
Can we use the same code in a Convert function? Or is there another way of doing this?
Thanks in advance.
-Amit
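For anyone who wants to check what that expression does to sample data, here is a minimal Python sketch of the same logic. It is not Ab Initio code. The function name clean is made up for illustration, and it assumes the pattern behaves the same way in Python's re module as it does in Ab Initio. One thing it shows: inside [...] the "|" is an ordinary character in Python's regex dialect, not an "or", so pipe characters are replaced too.

```python
import re

# Pattern copied from the post. Raw string, so the regex engine reads the
# octal escapes itself. Assumption: Ab Initio treats the class the same way.
UNWANTED = re.compile(r"[\001-\037|\177-\377]")


def clean(value: str) -> str:
    """Hypothetical helper mirroring string_replace(re_replace(...), char_string(0), " ")."""
    # Step 1: every character in the class becomes a space.
    replaced = UNWANTED.sub(" ", value)
    # Step 2: NUL (character code 0) becomes a space as well.
    return replaced.replace("\0", " ")


print(repr(clean("AB\x01CD")))   # control character 0x01
print(repr(clean("a|b")))        # literal pipe is also in the class
print(repr(clean("x\x00y")))     # NUL handled by the second step
```

If the pipe is not meant to be blanked out, the pattern would be "[\001-\037\177-\377]" instead.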
The Ab Initio code looks like a direct counterpart of the UNIX tr command, which you can use as a filter in the source Sequential File stage. The syntax of the DataStage CONVERT function is very different: you explicitly specify two strings, and each character in the first string is converted to the character at the same position in the second string. So you would need to list explicitly the characters you wish to convert in one string and then supply another string of the same length filled with spaces.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Convert() does not support ranges or octal representation, but you can set up a stage variable containing all the characters to be converted. A single Convert() call can then replace them with " ", taking the replacements from a second stage variable that contains the same number of space characters as there are unwanted characters in the first.
Or you could use a UNIX command such as tr (perhaps in an External Filter stage), which accepts character ranges and octal escapes like those in your example.
Code:
Convert(svUnwantedChars, svSpaces, InLink.TheString)
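To make the position-by-position idea concrete, here is a rough Python sketch of the same approach. It is not DataStage code: the convert function below only imitates the equal-length case of Convert(), and the way the two strings are built is an assumption for illustration. In a Transformer the character list would have to be assembled some other way (for example by concatenating Char() calls), which is not shown here. NUL (code 0) is left out of the list; add it if you need it.

```python
# Code points 1-31 and 127-255, mirroring the ranges in the original post.
UNWANTED_CODES = list(range(1, 32)) + list(range(127, 256))

# Stand-ins for the two stage variables named in the post.
sv_unwanted_chars = "".join(chr(code) for code in UNWANTED_CODES)
sv_spaces = " " * len(sv_unwanted_chars)


def convert(from_chars: str, to_chars: str, value: str) -> str:
    """Imitation of a positional character swap (equal-length lists only)."""
    if len(from_chars) != len(to_chars):
        raise ValueError("this sketch only handles lists of equal length")
    # Character i of from_chars is replaced by character i of to_chars.
    return value.translate(str.maketrans(from_chars, to_chars))


print(repr(convert(sv_unwanted_chars, sv_spaces, "AB\x01CD\x7fEF")))
```

Because every unwanted character has its own slot in the first string, no ranges or octal escapes are needed at the point where the conversion is applied.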
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 38
- Joined: Fri Apr 22, 2005 6:07 am
-
- Charter Member
- Posts: 822
- Joined: Sat Sep 17, 2005 5:25 pm
- Location: USA