matching job

Posted: Tue Jan 13, 2009 2:20 am
by suneelchallagali
Hi Guys,

The main goal of this job is to compare source and target. I have created a match specification for a reference match.

All the columns are varchar.

As the blocking variable I am using SSN.
As matching variables I am using firstname, SSN and age.

I have set m = .9 and u = .1.

First name has the comparison type uncert, with parameter 1 set to 900.
SSN is char.
Age is char.

I am getting the exact matches, but I am not getting the 90% matches on first name.

I have set the clerical, match and duplicate cutoffs to 0.

Can you please help me with this?

Thank you,
suneel

Posted: Tue Jan 13, 2009 5:09 am
by ray.wurlod
900 is NOT 90%. 900 is "must match exactly". Read the manual.

Posted: Tue Jan 13, 2009 8:19 am
by suneelchallagali
Hi ray,

I did set 900; I only wrote 90% to explain what I meant. I also tried 850, but I am still not getting the match.

Posted: Tue Jan 13, 2009 9:51 am
by JRodriguez
suneelchallagali

If you are using SSN as a blocking variable, you don't need to include it as a matching variable. The process will only compare records with the same SSN anyway.

Having a cutoff of 0 doesn't mean that records with a composite weight greater than 0 are matches. You should play around in the test environment to find the weight above which records are real matches (the graph shows those values), then set your cutoff to that value.

Thanks

Julio R
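
To illustrate the point about composite weights and cutoffs, here is a minimal Python sketch. It assumes the usual probabilistic-matching formulation, in which agreement on a column adds log2(m/u) to the composite weight and disagreement adds log2((1-m)/(1-u)). The function names and the classify() rule are hypothetical and are not taken from QualityStage itself.

```python
import math


def agreement_weight(m, u):
    # Weight added when a matching column agrees (assumed formulation).
    return math.log2(m / u)


def disagreement_weight(m, u):
    # Weight added when a matching column disagrees (negative when m > u).
    return math.log2((1 - m) / (1 - u))


def composite_weight(agreements, m=0.9, u=0.1):
    """Sum the per-column weights; agreements holds one boolean per column."""
    return sum(
        agreement_weight(m, u) if agrees else disagreement_weight(m, u)
        for agrees in agreements
    )


def classify(weight, match_cutoff, clerical_cutoff):
    # Hypothetical cutoff rule: at or above a cutoff counts as reaching it.
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "residual"


# Three matching columns, one of which disagrees.
weight = composite_weight([True, True, False])
print(round(weight, 2), classify(weight, match_cutoff=0, clerical_cutoff=0))
```

With m = .9 and u = .1, a pair that disagrees on one of three columns still has a positive composite weight, so a cutoff of 0 lets it through. Whether such a pair is a real match is what the test environment is for.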

Posted: Tue Jan 13, 2009 10:11 am
by suneelchallagali
Thank you JRodriguez

Posted: Tue Jan 13, 2009 5:41 pm
by suneelchallagali
I am getting this error:

REFERENCE_MATCH,0: Fatal error from object MatJoinOp, code 1
REFERENCE_MATCH,0: The runLocally() of the operator failed.
REFERENCE_MATCH,0: Operator terminated abnormally: runLocally did not return APT_StatusO
main_program: APT_PMsectionLeader(3, node3), player 15 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 16 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 20 - Unexpected exit status 1.

From the output of the reference match I am only using the exact matches and the residuals, as those are the only two I need.

The following fields are used for blocking and matching.

Blocking:
birthdate char
ssn char

Matching:

firstname uncert, parameter = 850
middle name uncert, parameter = 900
last name uncert, parameter = 850

When I run this job with reference data, it works fine for some runs but not for all of them. On the failing runs I get the error message shown above.

So can anyone please help me out?

Posted: Wed Jan 14, 2009 10:05 am
by JRodriguez
suneelchallagali

If you only want exact matches on birthdate and ssn, then do not use any matching fields; the process will send all records that don't match on those two fields to the residual output.

Regarding the error, it looks like the processes (players on different nodes) are running out of resources, normally memory or temp space.

One way to test whether you are running out of resources is to run your process in sequential mode ....


Julio Rodriguez

Posted: Wed Jan 14, 2009 10:24 am
by suneelchallagali
Hi rodriguez,


I have to match records that are an exact match as well as records that are a 90% match. For the 90% match on first_name, last_name and middle_name I have set the probabilities to m=.09 and u=.01 and parameter 1 to 850, but I am not getting the match even though the records are a 90% match.

The type for First_name, Middle_name and Last_name is uncert.

Posted: Wed Jan 14, 2009 10:37 am
by JRodriguez
suneelchallagali,

Just to be clear: you need exact matches on ssn and birthdate and, where the ssn or birthdate is missing or different, you would like to get all records that match 90% based on First_name, Middle_name and Last_name?

If so, you should have two passes. One pass has only birthdate and ssn as blocking fields; this pass will give you all the exact matches. Another pass uses a different set of blocking variables, such as NYSIIS of first name and NYSIIS of last name, with matching variables First_name, Middle_name and Last_name using Uncert with param1 set to 880 (play around to set the cutoff value).

Thanks
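
The two-pass idea can be sketched in plain Python as follows. This only illustrates the control flow and is not how the Match stage is implemented: crude_name_key() is a crude stand-in for a phonetic code and is NOT NYSIIS, similar() uses difflib as a stand-in for the uncert comparison, and the 0.85 threshold, the record layout and all function names are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def crude_name_key(name):
    """Crude stand-in for a phonetic code (this is NOT NYSIIS)."""
    letters = "".join(ch for ch in name.upper() if ch.isalpha())
    if not letters:
        return None
    # Keep the first letter, drop later vowels, truncate to 4 characters.
    rest = "".join(ch for ch in letters[1:] if ch not in "AEIOUY")
    return (letters[0] + rest)[:4]


def similar(a, b, threshold=0.85):
    """Stand-in for an uncertainty comparison; the threshold is hypothetical."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio() >= threshold


def exact_key(rec):
    # Pass 1 blocking key; None means "skip this record in this pass".
    if rec["ssn"] and rec["birthdate"]:
        return (rec["ssn"], rec["birthdate"])
    return None


def name_key(rec):
    # Pass 2 blocking key built from first and last name.
    first = crude_name_key(rec["first_name"])
    last = crude_name_key(rec["last_name"])
    return (first, last) if first and last else None


def names_agree(src, ref):
    fields = ("first_name", "middle_name", "last_name")
    return all(similar(src[f], ref[f]) for f in fields)


def run_pass(sources, references, block_key, is_match):
    """Return (matched id pairs, sources left over for the next pass)."""
    blocks = defaultdict(list)
    for ref in references:
        key = block_key(ref)
        if key is not None:
            blocks[key].append(ref)
    matched, leftover = [], []
    for src in sources:
        key = block_key(src)
        candidates = blocks.get(key, []) if key is not None else []
        hit = next((ref for ref in candidates if is_match(src, ref)), None)
        if hit is not None:
            matched.append((src["id"], hit["id"]))
        else:
            leftover.append(src)
    return matched, leftover


def rec(rec_id, ssn, birthdate, first, middle, last):
    return {"id": rec_id, "ssn": ssn, "birthdate": birthdate,
            "first_name": first, "middle_name": middle, "last_name": last}


sources = [
    rec(1, "111", "1970-01-01", "John", "A", "Smith"),
    rec(2, "", "1980-05-05", "Jonathan", "B", "Miller"),  # missing ssn
    rec(3, "333", "1990-09-09", "Alice", "", "Jones"),
]
references = [
    rec("r1", "111", "1970-01-01", "John", "A", "Smith"),
    rec("r2", "222", "1980-05-05", "Jonathon", "B", "Miller"),
]

# Pass 1: block on ssn + birthdate, no matching fields.
pass1, rest = run_pass(sources, references, exact_key, lambda s, r: True)
# Pass 2: only records left over from pass 1, blocked on the name keys.
pass2, residual = run_pass(rest, references, name_key, names_agree)
print(pass1, pass2, [s["id"] for s in residual])
```

Record 2 has no ssn, so it is skipped in the first pass and picked up in the second, where the name comparison tolerates the spelling difference. For simplicity the sketch offers every reference record to both passes.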

how to handle Null in blocking and matching fields

Posted: Tue Jan 20, 2009 2:35 pm
by suneelchallagali
Hi

I am using date of birth and SSN as blocking fields and first_name, last_name and middle_name as matching fields.

How should I handle nulls in the blocking fields as well as in the matching fields?

If the ssn field is null in both the source data and the reference data, but all the other fields have values and those values are the same, the record must go to the match dataset.

So can you please help me out?

Posted: Tue Jan 20, 2009 3:07 pm
by ray.wurlod
Is null genuine null or simply "missing"? There are ways to handle missing values, among them setting up VARTYPEs. Or you can convert the null values to something else upstream of the Match stage. To what values are your cutoffs set?

Posted: Tue Jan 20, 2009 3:16 pm
by suneelchallagali
It is a missing value; I mean blank values for particular fields.

Re: how to handle Null in blocking and matching fields

Posted: Thu Jan 22, 2009 10:57 am
by JRodriguez
suneelchallagali,

All records having null values in blocking fields are skipped in the current pass, but remain available for other passes. Just add other passes with different blocking fields to process those records.

Records will become matches, duplicates or residuals depending on the composite weight and the cutoff values for the pass. By default, a null value in a matching field adds a weight of zero (0) to the composite weight (you can use a different default if you want), so you should set the match, duplicate and clerical cutoffs to proper values.

You should also use a vartype such as "CRITICAL MISSINGOK" for fields in the matching commands that contain nulls, to specify that they need special handling and should still be considered in the matching process; if not, those records will become residuals.
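
The effect of a zero weight for missing values on the cutoffs can be sketched like this. It is a simplified Python illustration, not the Match stage's actual computation: it assumes weights of log2(m/u) for agreement and log2((1-m)/(1-u)) for disagreement, uses exact comparison instead of uncert, and the function names and the missing_weight parameter are hypothetical.

```python
import math


def field_weight(src_value, ref_value, m=0.9, u=0.1, missing_weight=0.0):
    """Weight for one matching field; blank or None counts as missing."""
    src_value = (src_value or "").strip()
    ref_value = (ref_value or "").strip()
    if not src_value or not ref_value:
        return missing_weight  # default: a missing value adds nothing
    if src_value == ref_value:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def composite_weight(src, ref, fields):
    return sum(field_weight(src.get(f), ref.get(f)) for f in fields)


fields = ("first_name", "middle_name", "last_name")
ref = {"first_name": "John", "middle_name": "A", "last_name": "Smith"}
complete = {"first_name": "John", "middle_name": "A", "last_name": "Smith"}
blank_middle = {"first_name": "John", "middle_name": " ", "last_name": "Smith"}

full = composite_weight(complete, ref, fields)
partial = composite_weight(blank_middle, ref, fields)
print(round(full, 2), round(partial, 2))
```

The record with the blank middle name scores lower than the complete one although nothing disagrees, so a match cutoff tuned only on complete records could push it below the cutoff.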

Posted: Thu Jan 22, 2009 11:28 am
by suneelchallagali
Thanks a lot for the help, JRodriguez!!!