Regarding QualityStage Match Concept

Posted: Wed Jan 31, 2007 4:04 am
by thesri
In the Match stage, how can I assign a weight to every field, and what is the composite weight? Are there any samples that could help me? Thanks in advance.

Posted: Wed Jan 31, 2007 4:37 am
by ray.wurlod
Welcome aboard. :D

While you CAN assign weight to every field in the Match stage, you don't want to. Let QualityStage calculate the weights based on the frequency distributions in the data, which you may have reported upon in an investigation.

There are agreement and disagreement weights calculated. You can bias these by altering them based upon external knowledge (typically about the general population compared to the sample upon which the frequencies were calculated). The agreement and disagreement weights are calculated for every field (unless excluded specifically from analysis) and reflect the "information content" - how rare the value is in its domain.
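To make the "information content" idea concrete, here is a minimal sketch of how agreement and disagreement weights can be derived from a frequency distribution. It assumes the standard probabilistic record-linkage formulation (log base 2 of m/u for agreement, and of (1-m)/(1-u) for disagreement), with the chance-agreement probability u approximated by the value's relative frequency. The function name, the m probability of 0.9 and the sample surnames are made up for illustration; this is not QualityStage's actual code.

```python
import math
from collections import Counter

def field_weights(values, value, m_prob=0.9):
    """Return (agreement, disagreement) weights for one value of one field.

    m_prob: assumed probability that the field agrees when the two records
            really are the same entity (hypothetical default).
    u_prob: probability that the field agrees by chance, approximated here
            by the relative frequency of the value in the data.
    """
    counts = Counter(values)
    u_prob = counts[value] / len(values)
    agreement = math.log2(m_prob / u_prob)
    disagreement = math.log2((1 - m_prob) / (1 - u_prob))
    return agreement, disagreement

# Hypothetical surname column: SMITH is common, ZABRISKIE is rare.
surnames = ["SMITH"] * 50 + ["JONES"] * 30 + ["BROWN"] * 19 + ["ZABRISKIE"]

common = field_weights(surnames, "SMITH")
rare = field_weights(surnames, "ZABRISKIE")

print("SMITH     agree/disagree:", common)
print("ZABRISKIE agree/disagree:", rare)
```

Agreeing on the rare surname earns a much larger agreement weight than agreeing on the common one, which is the "how rare the value is in its domain" effect described above.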

The agreement and disagreement weights are summed across the non-excluded fields in each record to yield the aggregate weight (this is the composite weight you asked about). It is the aggregate weights that determine which records are masters, which are duplicates and which are residuals during the match.
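As a rough illustration of the summing step: for each compared field, take the agreement weight if the two records agree on it and the disagreement weight if they do not, then add the results. The field names and weight values below are invented for the example.

```python
def aggregate_weight(comparisons):
    """Sum per-field weights for one pair of records.

    comparisons: list of (agrees, agreement_weight, disagreement_weight)
                 tuples, one per non-excluded field.
    """
    return sum(agree_w if agrees else disagree_w
               for agrees, agree_w, disagree_w in comparisons)

# Hypothetical comparison of two records on three fields.
pair = [
    (True, 6.5, -3.3),   # surname agrees
    (True, 2.0, -1.5),   # postcode agrees
    (False, 4.0, -2.5),  # birth date disagrees
]

total = aggregate_weight(pair)
print(total)
```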

You can set cutoff points that govern the decision whether a pair of records is a match or not, based on the aggregate weight of the putative duplicate.
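A small sketch of how cutoffs turn an aggregate weight into a decision. It assumes two cutoffs, a match cutoff and a lower clerical-review cutoff; the numeric values and the function name are hypothetical and would be tuned against your own data.

```python
def classify(weight, match_cutoff=10.0, clerical_cutoff=5.0):
    """Classify a record pair by its aggregate weight.

    At or above match_cutoff: treated as a match.
    Between the two cutoffs:  sent for clerical (manual) review.
    Below clerical_cutoff:    treated as a nonmatch.
    """
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"

for w in (12.4, 6.0, -3.1):
    print(w, classify(w))
```

Raising the match cutoff reduces false matches at the cost of more pairs landing in clerical review or being left unmatched.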

Posted: Wed Feb 07, 2007 10:16 am
by Alexander
In my opinion, weights based on the frequency distributions should only be used on fields with a well-known set of values, not on fields which accept free text, because the frequency tables will not cover the full range of values and the results can fall into extreme situations.

If you assign the weight for every field yourself you will lose a potential tool of QS, but on the other hand you will control exactly what weight is given to each record. And that can be a great advantage.

You can assign fixed weights as a first step, and then adjust them to use the frequency data :idea:.

Good luck!