How to remove space, coma, period from incpmeing data

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ashok
Participant
Posts: 43
Joined: Tue Jun 22, 2004 3:04 pm

How to remove space, coma, period from incpmeing data

Post by ashok »

help needed, How to remove characters in data containing like space, underscore, coma, period ect... and collect next letters
example
1) John_Big is in comeing data and I need out put as JohnB
2) Tim,Jr the out put should be TimJr
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Take a look through the Transforms that come with the product. For example, there is one called DIGITS that removes all non-numerics from a string. I'm sure there's some sort of alpha equivalent as well.

Just checked and there is - it's called LETTERS.

Looking at them, you'll see that they simply invoke Oconv with different conversion codes. You could do the same thing yourself, but which is clearer - "Oconv(YourField,"MCA")" or "LETTERS(YourField)"? :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

You should also look at QualityStage that handles free text standardisation better then DataStage. Part of the problem is keeping track of all the rules and combination of names. Are you always getting "junior" as Jr or are you also getting Jnr, Junior, J or Jun.
kura
Participant
Posts: 21
Joined: Sat Mar 20, 2004 3:43 pm

Re: How to remove space, coma, period from incpmeing data

Post by kura »

[quote="ashok"]help needed, How to remove characters in data containing like space, underscore, coma, period ect... and collect next letters
example
1) John_Big is in comeing data and I need out put as JohnB
2) Tim,Jr the out put should be TimJr[/quote]


Did you try to use change or ereplace function from transformer?

change(<input_column>,"/","") replaces all "/" with empty string.

I hope this hepls you. if want to replce both space and "/" then try this one
change(change(<input_column>,"/","")," ","")

Thanks
Kura
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Removing the non character fields is the easy bit, there are a dozen ways to do it, parsing the text left over is the hard bit. If someone has three names how do you tell which is the first name, middle name and surname?

Mary Ann Flanders :- could be Mary-Ann Flanders or Mary Ann-Flanders or Mary Ann Flanders. No way of telling the difference between first, middle and last name. You need to find out whether these extra "," and "_" characters have business rules behind them to help you parse the text they are delimiting. Eg. does a comma always indicate a name suffix, does an underscore always precede a surname. It is likely that the "-" character indicates two names that are really one so you don't want to remove those from the first name (Mary-AnnF) but you do want to remove them from a capitalised surname (MaryA).
gh_amitava
Participant
Posts: 75
Joined: Tue May 13, 2003 4:14 am
Location: California
Contact:

Post by gh_amitava »

Hi,

If you are designing jobs in PX environment, don't use any basic function which are used in server version like DIGIT etc. It will minimize the efficiency of your program. Use Convert function in PX Transformer. The syntax is available in the help.

Regards
Amitava
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The original poster specifed server job, and posted in the server-job forum. That answer was, therefore, not entirely helpful.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply