Spelling corrections in Address

pklcnu
Premium Member
Posts: 50
Joined: Wed Aug 06, 2008 4:39 pm

Post by pklcnu »

Dear Experts

I have three questions regarding address cleansing.

1) I have been given a reference table (from the Netherlands postal department) which contains the standard postcodes, street names, etc.

Is it possible to use this reference table in QualityStage for address standardization? If so, how?

2) Is it possible to correct spelling mistakes in the street names using the above approach? If so, how?

3) If I have data like "J.R. Accounts", which is incorrect, and the correct value is "J.R.S. Accounts", is it possible to cleanse this type of data?


The software that we have doesn't have a rule set for the Netherlands, and I have been asked to use the reference table mentioned above.

Any help and ideas will be much appreciated.

Many thanks in advance
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

1) Yes, but it would be easier once you build NLADDR and NLAREA (and maybe NLPREP) rule sets.

2) Typically you identify potential duplicates by matching on the phonetic equivalent plus as much other information as you have available, and specify that the reference table is the accurate one.

3) This is your NLNAME rule set (though any "name" rule set will probably work). Again the technique is matching on phonetic equivalent and as much other information as you have available. Now, though, you need a Survivorship rule to specify which one is correct.
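To illustrate the survivorship idea only: the sketch below is plain Python, not QualityStage Survive stage syntax. It assumes matching has already grouped the candidate records, and the field names "source" and "name" are hypothetical.

```python
def survive(group):
    """Pick the surviving record from a group of already-matched records."""
    # Rule 1: a record that came from the reference table always wins.
    for rec in group:
        if rec["source"] == "reference":
            return rec
    # Rule 2 (assumed fallback): keep the longest, i.e. most complete, name.
    return max(group, key=lambda rec: len(rec["name"]))


# Hypothetical match group: one input record, one reference record.
group = [
    {"source": "input", "name": "J.R. Accounts"},
    {"source": "reference", "name": "J.R.S. Accounts"},
]
print(survive(group)["name"])
```

Here the reference record survives, so the output name is "J.R.S. Accounts". The fallback rule is just one possible choice for groups that contain no reference record.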
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Re: Spelling corrections in Address

Post by stuartjvnorton »


1) You need to create your own rule set(s) in order to do proper parsing and standardisation in QS. You could use these reference tables in your classification file.
Maybe you can modify DEAREA and DEADDR if the basic structure is close enough to German addresses (I have no idea whether it is, so don't quote me).

2) Correction and standardisation are 2 rather different things.
It's easy to say Avenue = Av = Ave for standard terms (and the parse part helps to tell you if it's a standard term or just a word), but if you start changing things like the street name, then it should be because your reference files give you a way to know, or make one hell of a guess.
Depends what you are getting in your "etc". ;-)

3) If you have a reference set that has the full list of correct values, then you could try to match against it to find the most likely option.

Cleanse, correct, standardise: everything you're talking about is based on understanding context and having good enough reference data to either know for sure or make the best guess we can.
If you don't have good enough reference data, chances are your guesses won't be good enough either. ;-)
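The difference between standardising and correcting can be sketched outside QS in plain Python. This is only an illustration under assumptions: the lookup table and the reference list are made up, and difflib's similarity ratio stands in for whatever match scoring you actually use.

```python
import difflib

# Hypothetical lookup, standing in for a classification table of standard terms.
STREET_TYPES = {"AV": "AVENUE", "AVE": "AVENUE", "AVENUE": "AVENUE"}

# Hypothetical reference list of known-correct values.
REFERENCE_NAMES = ["J.R.S. Accounts", "A.B. Holdings"]


def standardise(address):
    """Standardisation: map known variants of standard terms to one form."""
    tokens = address.upper().replace(".", "").split()
    # Tokens that are not in the lookup (e.g. the street name) pass through.
    return " ".join(STREET_TYPES.get(tok, tok) for tok in tokens)


def correct(value, reference, cutoff=0.8):
    """Correction: return the most similar reference value, or None."""
    matches = difflib.get_close_matches(value, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(standardise("12 Park Ave."))
print(correct("J.R. Accounts", REFERENCE_NAMES))
print(correct("XYZ", REFERENCE_NAMES))
```

The first call only rewrites the street type and leaves the street name alone. The second changes the value because the reference list contains something close enough. The third returns None rather than guessing, which is the "not good enough reference data" case.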

Hope this helps. :-)
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

Have you tried the MNS stage and then doing a reference lookup against the given tables? For spelling mistake correction, Soundex matching might be an easy way.
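For anyone unfamiliar with Soundex, here is a minimal Python sketch of the classic American Soundex code. It is only an illustration of the idea, not what QualityStage runs internally, and the street names are made-up examples. Soundex was designed around English pronunciation, so how well it suits Dutch street names is something you would need to test.

```python
# Letter groups for classic American Soundex; vowels, H, W and Y have no code.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code of word ("" if it has no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        # Add a digit only when it differs from the previous letter's code.
        if code and code != prev:
            result += code
        # H and W do not separate two letters that share a code; vowels do.
        if ch not in "HW":
            prev = code
    # Pad with zeros and truncate to one letter plus three digits.
    return (result + "000")[:4]


# A misspelt street name and its reference spelling get the same code.
print(soundex("Kalverstrat"), soundex("Kalverstraat"))
```

Both names come out as K416, so a match on the Soundex code (ideally combined with the postcode) would pair the misspelt value with the reference one.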
Joshy George
pklcnu
Premium Member
Posts: 50
Joined: Wed Aug 06, 2008 4:39 pm

Post by pklcnu »

Thanks for your suggestions. I will let you know the outcome soon.