m-prob and u-prob in QS

kkumardatastage
Participant
Posts: 84
Joined: Sat Jul 19, 2008 8:50 am

Post by kkumardatastage »

Hi

Please can anyone help me?
1) What is meant by m-prob and u-prob in the QualityStage Match Designer? I know the default values we need to give, but what effect do they have on the data? Please, if u have an example, that would be great.
2) What are the minimum and maximum cutoffs we can use for Match and Clerical?
3) What should Parm 1 be? I know the Parm value for an exact match on names should be 900, but what would the Parm value be for DOB (can u give me an example)?
4) Match Designer shows the Agree Weight and Disagree Weight, but is there any way to display the weights for each individual column? (There is a statistics view that shows u the percentage, but I need to check the scores for the columns.)

Please can you help me with these questions?

Thanks
k
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Tell you what, you research in the QualityStage User Guide the answers to your questions and post back here, and we'll help to clarify any remaining doubts.

When you post back, don't address your question to U (one of our posters) specifically - U doesn't check in all that often. Note, perhaps, that the second person personal pronoun in English is spelled "you", and that we strive for a professional standard of written English here on DSXchange. There is no need for SMS-style abbreviations; you are not limited to 140 characters.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

m and u-Prob

- m-Prob reflects the error rate for the column: it is effectively one minus that error rate. In layman's terms, this is the probability that two column values that should match do end up matching.

- u-Prob is the probability that the column agrees given that the record pair does not match. In layman's terms, this is the probability that two column values that shouldn't match end up matching anyway, purely by chance.
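
As a rough illustration of what these two probabilities do to the data, here is a minimal Python sketch. It assumes the usual probabilistic-matching formulas, where the agreement weight is log2(m/u) and the disagreement weight is log2((1-m)/(1-u)); the function names and the sample values 0.9 and 0.01 are only for illustration and are not taken from the product.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Weight added to the composite score when the column agrees."""
    return math.log2(m / u)


def disagreement_weight(m: float, u: float) -> float:
    """Weight added (normally negative) when the column disagrees."""
    return math.log2((1.0 - m) / (1.0 - u))


# Illustrative values only: a fairly reliable column (m = 0.9) whose
# values rarely agree by chance (u = 0.01).
m_prob, u_prob = 0.9, 0.01
print("agree   :", round(agreement_weight(m_prob, u_prob), 2))
print("disagree:", round(disagreement_weight(m_prob, u_prob), 2))

# A higher m-prob (more trusted column) makes a disagreement cost more.
print("disagree, m=0.99:", round(disagreement_weight(0.99, u_prob), 2))

# A lower u-prob (rarer chance agreement) makes an agreement worth more.
print("agree, u=0.001  :", round(agreement_weight(m_prob, 0.001), 2))
```

So under these assumptions, raising m-Prob for a column increases the penalty when it disagrees, and lowering u-Prob increases the reward when it agrees.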



Match and Clerical Cut off values

Zero should be the minimum; negative values don't make sense. The maximum depends on your data and on how many tokens the record contains. Each token in the record adds to or subtracts from the composite weight depending on its m-Prob and u-Prob, so a fixed maximum cutoff value doesn't exist.

Normally I use zero (0) for the match, duplicate and clerical cutoffs as the initial value when I start researching the final cutoff values for the data / match specification. The final set of cutoff values should be taken from the histogram, which helps you determine, graphically and with sample data, the composite weight level at which the records stop matching. You need very good knowledge of the data, or somebody from the business side to help you with the task. The same process is used for the clerical cutoff.
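
The way the composite weight and the two cutoffs interact can be sketched in Python as below. This is only an illustration under stated assumptions: the per-column weights, the cutoff values 8.0 and 3.0, and the function names are all hypothetical, and it assumes a pair at or above the match cutoff is a match, a pair between the clerical and match cutoffs goes to clerical review, and anything lower is a nonmatch.

```python
from typing import List, Tuple

# One entry per compared column: (did it agree?, agree weight, disagree weight).
ColumnResult = Tuple[bool, float, float]


def composite_weight(columns: List[ColumnResult]) -> float:
    """Sum the agree or disagree weight of every compared column."""
    total = 0.0
    for agrees, agree_w, disagree_w in columns:
        total += agree_w if agrees else disagree_w
    return total


def classify(weight: float, match_cutoff: float, clerical_cutoff: float) -> str:
    """Place a record pair into match, clerical or nonmatch."""
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"


# Hypothetical record pair: two columns agree, one disagrees.
pair = [(True, 6.5, -3.3), (True, 4.0, -2.0), (False, 5.0, -4.0)]
score = composite_weight(pair)
print(score, classify(score, match_cutoff=8.0, clerical_cutoff=3.0))
```

Because the score is just a sum over the columns, the highest possible score depends on how many columns are compared and on their weights, which is why no fixed maximum cutoff exists.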


Param1

- Param1 has a different meaning in different match comparisons, so it depends on the match algorithm that you are using.

If you are matching names or material descriptions, you should allow room for misspellings, typos, transpositions and other common errors. When you are matching DOB you don't allow that kind of error, so you use a different comparison.
Julio Rodriguez
ETL Developer by choice

"Sure we have lots of reasons for being rude - But no excuses