Matching

Infosphere's Quality Product

Moderators: chulett, rschirm

jeesim
Participant
Posts: 7
Joined: Tue Aug 04, 2009 9:19 am

Matching

Post by jeesim »

I am using Unduplicate (duplicate) for matching. The matching results are mostly fine. The problem is that in some cases records are matched when they should not be.
For example:
Jack Doe, 123 John Doe St, MN, 32101
Jane Doe, 123 John Doe St, MN, 32101
are considered a match.
Has anyone come across this problem? If so, what is the solution?

Blocking
CityNameNYSIIS_USAREA
ZipCode_USAREA
StateAbbreviation_USAREA
StreetNameNYSIIS_USADDR
HouseNumber_G1USADDR
MatchPrimaryword1NYSIIS_USNAME
MatchFirstNameNYSIIS_USNAME

Matching
HouseNumber_G1USADDR
StreetPrefixDirectional_G1USADDR
StreetPrefixType_G1USADDR
StreetName_G1USADDR
StreetSuffixDirectional_G1USADDR
StreetSuffixType_G1USADDR
ZipCode_USAREA
MatchFirstName_G1USNAME
MatchPrimaryName_G1USNAME
GenderCode_G1USNAME

Thanks
Jeesim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Depends how your cutoffs are set. With your choices of blocking fields and matching rules, these will be quite close, but will have different match weights. You have a number of choices. For example, you could make GenderCode a blocking field, or make GenderCode subject to special variable handling (critical, missing OK), or heavily penalize differences in GenderCode through a disagreement weight override.
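
To illustrate the cutoff and disagreement weight override options, here is a rough Python sketch of how a composite weight is summed and compared with a match cutoff. It is not QualityStage code. It assumes the usual probabilistic weights (agreement = log2(m/u), disagreement = log2((1-m)/(1-u))) and exact comparisons on every column. The m and u probabilities, the cutoff of 15 and the override of -20 are made-up numbers, and the column names are shortened.

```python
import math

def agreement_weight(m, u):
    """Weight added when a column agrees."""
    return math.log2(m / u)

def disagreement_weight(m, u):
    """Weight added (a negative number) when a column disagrees."""
    return math.log2((1 - m) / (1 - u))

# Hypothetical (m, u) probabilities per match column; not taken from the post.
FIELDS = {
    "HouseNumber": (0.9, 0.01),
    "StreetName": (0.9, 0.01),
    "ZipCode": (0.9, 0.01),
    "MatchPrimaryName": (0.9, 0.01),
    "MatchFirstName": (0.9, 0.01),
    "GenderCode": (0.9, 0.5),  # only two common values, so chance agreement is high
}

MATCH_CUTOFF = 15.0  # hypothetical cutoff

def composite_weight(agreements, overrides=None):
    """Sum per-column weights; overrides replace the computed disagreement weight."""
    overrides = overrides or {}
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if agreements[field]:
            total += agreement_weight(m, u)
        elif field in overrides:
            total += overrides[field]
        else:
            total += disagreement_weight(m, u)
    return total

# Jack Doe vs Jane Doe: address and surname agree, first name and gender differ.
jack_vs_jane = {
    "HouseNumber": True,
    "StreetName": True,
    "ZipCode": True,
    "MatchPrimaryName": True,
    "MatchFirstName": False,
    "GenderCode": False,
}

default_score = composite_weight(jack_vs_jane)
penalized_score = composite_weight(jack_vs_jane, overrides={"GenderCode": -20.0})

print("default:", round(default_score, 2), "match?", default_score >= MATCH_CUTOFF)
print("penalized:", round(penalized_score, 2), "match?", penalized_score >= MATCH_CUTOFF)
```

With these assumed numbers the four agreeing address and surname columns outweigh the two disagreements, so the pair clears the cutoff. Once the GenderCode disagreement is heavily penalized, it no longer does.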
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

For that example, you don't have a suburb/city, so CityNameNYSIIS_USAREA is blank.
If a record has a blank in a blocking field, the record is not included in matching.
Maybe CityNameRSVNDX_USAREA would work better (at least it always has a value).
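
As a rough illustration of that blank-blocking-field behaviour, here is a small Python sketch. It only models the rule as described above, not how the product implements it. The `build_blocks` helper and the record values are made up for the example.

```python
from collections import defaultdict

def build_blocks(records, columns):
    """Group records by blocking key; set aside any record with a blank blocking column."""
    blocks = defaultdict(list)
    skipped = []
    for rec in records:
        key = tuple((rec.get(col) or "").strip() for col in columns)
        if any(part == "" for part in key):
            skipped.append(rec)  # never compared against anything in this pass
            continue
        blocks[key].append(rec)
    return dict(blocks), skipped

# Hypothetical standardized records: no city was supplied, so the NYSIIS code is blank.
records = [
    {"name": "Jack Doe", "CityNameNYSIIS_USAREA": "", "ZipCode_USAREA": "32101"},
    {"name": "Jane Doe", "CityNameNYSIIS_USAREA": "", "ZipCode_USAREA": "32101"},
]

# Blocking on city + zip: both records drop out of the pass.
blocks_city_zip, skipped_city_zip = build_blocks(
    records, ["CityNameNYSIIS_USAREA", "ZipCode_USAREA"]
)

# Blocking on zip only: both records land in the same block and get compared.
blocks_zip, skipped_zip = build_blocks(records, ["ZipCode_USAREA"])

print(len(blocks_city_zip), len(skipped_city_zip))
print(len(blocks_zip), len(skipped_zip))
```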

Hope that helps.
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

You might want to consider not using city or state in your blocking, because you are already using zip code. For this pass, zip gets you to the city and state (assuming no typos).
Regards,
Robert
asyafrudin
Participant
Posts: 16
Joined: Thu Oct 21, 2010 1:40 am
Location: Indonesia

Re: Matching

Post by asyafrudin »

As ray.wurlod suggested, I'm guessing it's either your cutoff values or the parameters in your match commands that cause "Jack Doe" and "Jane Doe" to match. It would be great if you could also share the details of your match commands. Simply looking at the columns is not enough to offer any suggestions.
Perfection is not about making no mistakes. Perfection is about fixing your mistakes.