Problem with name and address match

Infosphere's Quality Product

Moderators: chulett, rschirm

jeesim
Participant
Posts: 7
Joined: Tue Aug 04, 2009 9:19 am

Post by jeesim »

Hi,

I am using the Unduplicate stage to match on name (first, middle, and last) and address. About 99% of the matches meet my requirements.

I have a problem when people residing at the same address have first names that start with the same initial. In this case QualityStage considers them a match. If we have only the first initial for one person and the other person's name starts with the same initial, then we do want them to be considered a match.

Example :
Jean Doe 123 Main Street , Warren, NJ, 09088
James Doe 123 Main Street , Warren, NJ, 09088

QualityStage considers them a match, which is not what we want.

J Doe 123 Main Street , Warren, NJ, 09088
John Doe 123 Main Street , Warren, NJ, 09088

QualityStage considers them a match, and we are fine with this result.

The columns used in blocking are:
CitynameNYSIIS_USAREA
ZIPCODE_USAREA
StreetName_NYSIIS_USAADDR
MatchPrimarywordNYSIIS_USNAME
MatchFirstnameNYSISS_USNAME
HouseNumber_USADDR



The columns used in matching are:

HouseNumber_USADDR
StreetPrefixDirectional_USADDR
StreetPrefixtype_USADDR
StreetName_USADDR
StreetSuffixDirectional_USADDR
StreetSuffixtype_USADDR
UnitType_USADDR
UnitValue_USADDR
ZipCode_USAREA
Zip4AddonCode_USAREA
MatchFirstName_USNAME
MatchPrimaryName_USNAME
NameGeneration_USNAME

Thanks
jeesim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Create another field containing the initial and match on that as CHAR.
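
A minimal Python sketch of this idea, outside QualityStage: derive a first-initial field and compare it character for character. The function names are hypothetical, and treating blank values as non-agreeing is an assumption of the sketch, not documented stage behaviour. On its own this comparison only confirms that "J" and "John" agree; it does not separate "Jean" from "James".

```python
def first_initial(first_name: str) -> str:
    """Return the upper-cased first letter of a name, or '' if the name is blank."""
    return first_name.strip()[:1].upper()


def char_agree(a: str, b: str) -> bool:
    """Exact character comparison; blank values never agree (assumption)."""
    return bool(a) and bool(b) and a == b


if __name__ == "__main__":
    pairs = [("J", "John"), ("Jean", "James"), ("Mary", "John")]
    for left, right in pairs:
        agree = char_agree(first_initial(left), first_initial(right))
        print(left, right, agree)
```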
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

What is the cutoff?
If you haven't set one, it will default to 0, and as long as the addresses match there's probably more than enough weight for the pair to be considered a match, regardless of who lives there.
jeesim
Participant
Posts: 7
Joined: Tue Aug 04, 2009 9:19 am

Post by jeesim »

The cutoff is set to 58. The weight for most of these matches is about 68.
We need to take the first name into account, as we are consolidating at the customer level, not at the household level.

Has anyone come across this problem?

Thanks
Jeesim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

How is the pattern I+ being handled by your rule set?
jeesim
Participant
Posts: 7
Joined: Tue Aug 04, 2009 9:19 am

Post by jeesim »

The I+ pattern is handled as First Name = I, Last Name = +.

My problem is not with the first initial. My issue is the way QualityStage handles names that start with the same initial.
QualityStage considers James, Jerry, Jane, etc. to be the same individual because they have the same address. In our business scenario they are different customers or patients.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That is only true if you match on them. If you match on first initial, every name with the same first initial will be "the same".
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

What score do you get when you match a record against itself?
Maybe your cutoff needs to be a little higher, and you need a separate field to use in a separate pass for matching initials to names.

What scores do you get for the individual field when you match Jerry and Jacob?
Do you see the disagreement score and it's still high enough to get past the cutoff, or do you actually see the agreement score? What type of match are you using for the comparison for that field? What is your Comparison Threshold set at?
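
To make these questions concrete, here is a rough Python sketch of how a composite weight relates to the cutoff. Only the cutoff of 58 comes from this thread; every per-field agreement and disagreement weight below is a made-up number for illustration, and the real values depend on the match specification. The point is that when the address and last name agree strongly, a mild first-name disagreement weight can still leave the total above the cutoff.

```python
CUTOFF = 58  # match cutoff reported in this thread

# Hypothetical (agreement, disagreement) weights per field.
WEIGHTS = {
    "house_number": (12.0, -6.0),
    "street_name": (18.0, -9.0),
    "zip_code": (16.0, -8.0),
    "last_name": (20.0, -10.0),
    "first_name": (8.0, -4.0),
}


def composite_weight(agreements, weights=WEIGHTS):
    """Sum the agreement or disagreement weight of every field."""
    return sum(
        agree_w if agreements[field] else disagree_w
        for field, (agree_w, disagree_w) in weights.items()
    )


if __name__ == "__main__":
    # Same address and last name, different first name (e.g. Jerry vs Jacob).
    pair = {field: True for field in WEIGHTS}
    pair["first_name"] = False

    print(composite_weight(pair))            # 62.0, still above the cutoff
    # Same pair with a heavier (still hypothetical) first-name penalty.
    heavier = dict(WEIGHTS, first_name=(8.0, -20.0))
    print(composite_weight(pair, heavier))   # 46.0, now below the cutoff
```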
dsqspro
Premium Member
Posts: 20
Joined: Wed Apr 15, 2009 7:01 am

Post by dsqspro »

Create a field holding the first three characters of the first name, and apply a heavy negative weight if the three letters don't match.
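
A small Python sketch of that suggestion, with one addition that is not in the post above: when either name is only a single initial, the comparison falls back to the initial so that "J" and "John" still agree, as the original question requires. The function name and both weight values are hypothetical, and returning 0 for a missing value is also an assumption.

```python
def first_name_weight(a: str, b: str, agree: float = 8.0, disagree: float = -30.0) -> float:
    """Score two first names on their first three characters.

    A bare initial on either side is compared on the first letter only.
    """
    a, b = a.strip().upper(), b.strip().upper()
    if not a or not b:
        return 0.0  # missing value contributes nothing (assumption)
    if len(a) == 1 or len(b) == 1:
        return agree if a[0] == b[0] else disagree
    return agree if a[:3] == b[:3] else disagree


if __name__ == "__main__":
    for left, right in [("Jean", "James"), ("J", "John"), ("John", "Johnny")]:
        print(left, right, first_name_weight(left, right))
```

With a penalty this large, "Jean" and "James" at the same address would lose enough weight to fall below a cutoff such as 58, while "J" and "John" keep the agreement weight.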