
Matching question

Posted: Mon Oct 17, 2011 3:13 pm
by sigma
Dear all

I am trying to use QualityStage to find matches within our customer master.

The challenge is that I do not have much input to go on. All I have is a file with two columns: customer name and state.

For this post please assume data is specific to US only.

I have not used standardization at all... just taking raw data from the file which has the customer name and state and trying to find a match.

To narrow the blocking criteria, I am taking the first 4 characters of the name from each source.

So my blocking is COUNTRY, STATE and the first four characters of the name.

In my example I pass one record

HARVARD UNIV, MA

It does get decent matches but will not match HARVARD BIOLOGY

This happens even if I keep all match scores at zero.

I realize this is not a great example, but I want to understand why it would not pick up a match on HARVARD BIOLOGY.

Posted: Mon Oct 17, 2011 3:21 pm
by ray.wurlod
Blocking should group the HARV values together into a set. What match rules are you using? Are there any overflow blocks (review the match statistics)? Any records in an overflow block will be treated automatically as residuals. If that's occurring, review your blocking and matching strategies.
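
To make the blocking and overflow idea concrete, here is a rough sketch in plain Python. It is not how QualityStage implements blocking internally; it only mimics the described key (COUNTRY, STATE, first four characters of the name). The function names, the max_block_size parameter and its value are hypothetical, the HARVEST FOODS record is made-up sample data, and it assumes HARVARD BIOLOGY is also in MA.

```python
from collections import defaultdict


def blocking_key(record, prefix_len=4):
    """Build a block key from country, state and the first characters of the name."""
    name = record["name"].strip().upper()
    return (record["country"], record["state"], name[:prefix_len])


def build_blocks(records, prefix_len=4):
    """Group records that share the same blocking key."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec, prefix_len)].append(rec)
    return dict(blocks)


def overflow_blocks(blocks, max_block_size):
    """Return the blocks larger than max_block_size (a hypothetical limit)."""
    return {key: recs for key, recs in blocks.items() if len(recs) > max_block_size}


# Sample data: the two HARVARD names come from the post, the third is made up.
records = [
    {"name": "HARVARD UNIV", "state": "MA", "country": "US"},
    {"name": "HARVARD BIOLOGY", "state": "MA", "country": "US"},
    {"name": "HARVEST FOODS", "state": "CA", "country": "US"},
]

blocks = build_blocks(records)

# Both HARVARD records share the key ("US", "MA", "HARV"), so they are compared.
print([r["name"] for r in blocks[("US", "MA", "HARV")]])

# With an artificially small limit, the HARV/MA block is flagged as oversized.
print(list(overflow_blocks(blocks, max_block_size=1)))
```

If both records land in the same block, as they do here, blocking itself is not what keeps them apart, and the next things to check are the match rules and whether that block overflowed.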