Matching

Posted: Fri Jul 08, 2011 8:04 am
by jeesim
I am using Unduplicate (duplicate) matching. The matching results are mostly fine, but in some cases records that should not match are being matched.
For example:
Jack Doe, 123 John Doe St, MN, 32101
Jane Doe, 123 John Doe St, MN, 32101
are considered a match.
Has anyone come across this problem? If so, what is the solution?

Blocking
CityNameNYSIIS_USAREA
ZipCode_USAREA
StateAbrivation_USAREA
StreetNameNYSIIS_USADDR
HouseNumber_G1USADDR
MatchPrimaryword1NYSIIS_USNAME
MatchFirstNameNYSIIS_USNAMe

matching
HouseNumber_G1USADDR
StreetPrefixDirectional_G1USADDR
StreetPrefixType_G1USADDR
StreetName_G1USADDR
StreetSuffixDirectional_G1USADDR
StreetSuffixType_G1USADDR
ZipCode_USAREA
MatchFirstName_G1USNAME
MatchPrimaryName_G1USNAME
GenderCode_G1USNAME

Thanks
Jeesim

Posted: Fri Jul 08, 2011 7:01 pm
by ray.wurlod
Depends how your cutoffs are set. With your choices of blocking fields and matching rules, these two records will score quite close to each other but will have different match weights. You have a number of choices. For example, you could make GenderCode a blocking field, or make GenderCode subject to special variable handling (critical, missing OK), or heavily penalize differences in GenderCode through a disagreement weight override.
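
To illustrate how a cutoff and a disagreement weight override interact, here is a minimal Python sketch of probabilistic (Fellegi-Sunter style) weighting. It is not QualityStage code: the m/u probabilities, the cutoff of 15, the override of -20, and the shortened field names are all made-up values for illustration, and the comparison is a plain equality test rather than any of the real comparison types.

```python
import math

# Hypothetical (m, u) probabilities per matching field; not taken from any real match spec.
FIELDS = {
    "HouseNumber": (0.95, 0.01),
    "StreetName": (0.90, 0.01),
    "ZipCode": (0.95, 0.01),
    "MatchFirstName": (0.90, 0.01),
    "MatchPrimaryName": (0.90, 0.01),
    "GenderCode": (0.95, 0.50),
}

def agreement_weight(m, u):
    # Weight added when the two values agree.
    return math.log2(m / u)

def disagreement_weight(m, u):
    # Weight (negative) added when the two values disagree.
    return math.log2((1 - m) / (1 - u))

def composite_weight(rec_a, rec_b, overrides=None):
    # Sum the per-field weights; an override replaces the computed
    # disagreement weight for the named field.
    overrides = overrides or {}
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if rec_a[field] == rec_b[field]:
            total += agreement_weight(m, u)
        else:
            total += overrides.get(field, disagreement_weight(m, u))
    return total

jack = {"HouseNumber": "123", "StreetName": "JOHN DOE", "ZipCode": "32101",
        "MatchFirstName": "JACK", "MatchPrimaryName": "DOE", "GenderCode": "M"}
jane = dict(jack, MatchFirstName="JANE", GenderCode="F")

MATCH_CUTOFF = 15.0  # hypothetical cutoff

default_score = composite_weight(jack, jane)
penalized_score = composite_weight(jack, jane, overrides={"GenderCode": -20.0})

print(default_score >= MATCH_CUTOFF)    # address agreement outweighs the two disagreements
print(penalized_score >= MATCH_CUTOFF)  # the override pushes the pair below the cutoff
```

With these assumed numbers, the four agreeing address and surname fields contribute enough weight that the pair clears the cutoff even though first name and gender disagree. Penalizing GenderCode heavily is one way to bring it back under.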

Posted: Sun Jul 10, 2011 6:58 pm
by stuartjvnorton
For that example, you don't have a suburb/city, so CityNameNYSIIS_USAREA is blank.
If a record has a blank in a blocking field, the record is not included in matching.
Maybe CityNameRSVNDX_USAREA would work better (at least it always has a value).

Hope that helps.
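
The blocking behaviour described in the post above can be sketched in Python as follows. This is only an illustration of the idea, not how QualityStage is implemented: the `id` field, the two-column blocking list, and the sample values (including the placeholder city code) are assumptions.

```python
from collections import defaultdict
from itertools import combinations

# Two of the blocking columns from the original post, used here for brevity.
BLOCKING_COLUMNS = ("CityNameNYSIIS_USAREA", "ZipCode_USAREA")

def candidate_pairs(records):
    # Group records by their blocking key; only records in the same
    # block are ever compared with each other.
    blocks = defaultdict(list)
    for rec in records:
        key = tuple((rec.get(col) or "").strip() for col in BLOCKING_COLUMNS)
        if any(part == "" for part in key):
            # A blank in any blocking column keeps the record out of matching.
            continue
        blocks[key].append(rec["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(ids, 2))
    return pairs

# Hypothetical records: same zip, but no city was supplied.
no_city = [
    {"id": 1, "CityNameNYSIIS_USAREA": "", "ZipCode_USAREA": "32101"},
    {"id": 2, "CityNameNYSIIS_USAREA": "", "ZipCode_USAREA": "32101"},
]
# The same records with a placeholder (not a real NYSIIS) city code filled in.
with_city = [dict(r, CityNameNYSIIS_USAREA="CITYCODE") for r in no_city]

print(candidate_pairs(no_city))
print(candidate_pairs(with_city))
```

In this sketch the records without a city never form a candidate pair, while the same records with a populated city column do.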

Posted: Mon Jul 11, 2011 6:18 am
by rjdickson
You might want to consider not using any city or state in your blocking because you are already using zip code. For this pass, zip gets you to the city and state (assuming no typos).

Re: Matching

Posted: Tue Jul 12, 2011 1:19 am
by asyafrudin
jeesim wrote: I am using Unduplicate (duplicate) matching. The matching results are mostly fine, but in some cases records that should not match are being matched.
For example:
Jack Doe, 123 John Doe St, MN, 32101
Jane Doe, 123 John Doe St, MN, 32101
are considered a match.
Has anyone come across this problem? If so, what is the solution?

Blocking
CityNameNYSIIS_USAREA
ZipCode_USAREA
StateAbrivation_USAREA
StreetNameNYSIIS_USADDR
HouseNumber_G1USADDR
MatchPrimaryword1NYSIIS_USNAME
MatchFirstNameNYSIIS_USNAMe

matching
HouseNumber_G1USADDR
StreetPrefixDirectional_G1USADDR
StreetPrefixType_G1USADDR
StreetName_G1USADDR
StreetSuffixDirectional_G1USADDR
StreetSuffixType_G1USADDR
ZipCode_USAREA
MatchFirstName_G1USNAME
MatchPrimaryName_G1USNAME
GenderCode_G1USNAME

Thanks
Jeesim
As ray.wurlod suggested, I'm guessing it's either your cutoff values or the parameters in your match commands that cause "Jack Doe" and "Jane Doe" to match. It would help if you could also share the details of your match commands. The column names alone are not enough to offer specific suggestions.