multiple passes question

Posted: Wed Sep 28, 2011 3:47 pm
by vijaydasari
I have a match specification with two passes, and the match type I am using is Unduplicate Dependent. The first pass matches on address details; based on the address match, I am seeing master and duplicate records.

The second pass matches on customer ID: if different addresses share the same customer ID, the objective is to treat them as a single customer.

The problem is that master records that have weights in the first pass are becoming residual records in the second pass, even though they have the same customer ID.

How can I make them a master and duplicate pair?

Posted: Wed Sep 28, 2011 3:51 pm
by ray.wurlod
Can you please supply more details about the blocking fields and match rules used for each pass?

Posted: Wed Sep 28, 2011 4:07 pm
by vijaydasari
pass1

block commands

primaryname_USNAME
matchfirstname_USNAME
zipcode_USAREA
housenumber_usaddr
streetname_usaddr

match commands

address1
state code
postal code

pass2

block commands

Customer id

no match commands

Posted: Wed Sep 28, 2011 6:49 pm
by stuartjvnorton
With no match fields, the second pass has no way to produce a score other than 0. And if the cutoffs are also zero, the record will be a residual instead of a dupe.

So either add a matching field (e.g. the customer ID again) to manufacture a score, or reduce the cutoffs below zero to force dupes.

Posted: Wed Sep 28, 2011 9:11 pm
by vijaydasari
Thank you very much for your suggestion. I tried adding customer ID as a match command, but I still have the same issue.

I will try the option of setting the cutoffs below zero.

Re: multiple passes question

Posted: Thu Sep 29, 2011 2:02 am
by BI-RMA
vijaydasari wrote: I have a match specification with two passes, and the match type I am using is Unduplicate Dependent. The problem is that master records that have weights in the first pass are becoming residual records in the second pass, even though they have the same customer ID.
Hi Vijay,

If you use Unduplicate Dependent, data will only run through the second match-pass when match-pass one was unsuccessful. Duplicates are removed from further match consideration after the first pass. They do not become residuals in the second pass.

If you want your second match-pass to contain all records, you have to use Unduplicate Independent.
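
To make the difference concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not QualityStage's actual algorithm: matching is reduced to exact equality on a single field, and the function names (`run_pass`, `dependent`, `independent`) and the sample records are made up for illustration.

```python
from collections import defaultdict

def run_pass(records, field):
    """Group record ids that share the same value of `field`.

    Toy stand-in for a match pass: the first id in each group plays the
    master, the rest play the duplicates. Single records form no group.
    """
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec[field]].append(rec["id"])
    return [ids for ids in blocks.values() if len(ids) > 1]

def dependent(records, fields):
    """Each pass only sees records not already flagged as duplicates."""
    remaining = list(records)
    groups = []
    for field in fields:
        found = run_pass(remaining, field)
        groups.extend(found)
        duplicates = {i for group in found for i in group[1:]}
        remaining = [r for r in remaining if r["id"] not in duplicates]
    return groups

def independent(records, fields):
    """Every pass sees every record."""
    groups = []
    for field in fields:
        groups.extend(run_pass(records, field))
    return groups

# Hypothetical data: records 1 and 2 share an address,
# records 2 and 3 share a customer id.
records = [
    {"id": 1, "address": "A", "customer_id": "C1"},
    {"id": 2, "address": "A", "customer_id": "C2"},
    {"id": 3, "address": "B", "customer_id": "C2"},
]
passes = ["address", "customer_id"]

dependent_groups = dependent(records, passes)
independent_groups = independent(records, passes)
```

In this toy model the dependent run never pairs records 2 and 3, because record 2 was already taken out as a duplicate in the address pass, while the independent run finds both groups.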

Posted: Thu Sep 29, 2011 8:41 am
by vijaydasari
I included customer ID in the match commands and set the agreement and disagreement weights to 10 and -15 respectively.

Also, in the cutoff values section, the match and clerical fields accept values from 0 to 999999.99.

I tried different values for the match and clerical cutoffs, but I am still facing the same issue.

Re: multiple passes question

Posted: Thu Sep 29, 2011 8:48 am
by vijaydasari
Hi BI-RMA,

In the first pass I have 5 master records (record type MP) with set IDs 1, 2, 4, 6 and 9. These five records came out as residual records in the second pass with the same set ID numbers, but the record type changed to RA.

My objective is to make one of the records MP and the rest DA.

Re: multiple passes question

Posted: Thu Sep 29, 2011 9:49 am
by BI-RMA
vijaydasari wrote:Hi BI-RMA,

In the first pass I have 5 master records (record type MP) with set IDs 1, 2, 4, 6 and 9. These five records came out as residual records in the second pass with the same set ID numbers, but the record type changed to RA.

My objective is to make one of the records MP and the rest DA.
Hi Vijay,

The more I think of it the more it seems to me that running your data through a second match-pass may be a bad idea altogether.

What you want to achieve is basically to get a common key for groups of records sharing the same customer number and this is completely unrelated to your first match-pass, right?

You can achieve that in DataStage by sorting by CustomerId and evaluating key-change.
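
Outside DataStage, the same sort and key-change idea can be sketched in a few lines of Python. This only illustrates the logic and is not a DataStage job: the function name `assign_group_keys`, the field names, and the sample records are assumptions made up for the example.

```python
def assign_group_keys(records, key_field="customer_id"):
    """Sort by key_field and start a new group whenever the key changes."""
    ordered = sorted(records, key=lambda r: r[key_field])
    result = []
    group_id = 0
    previous = object()  # sentinel that never equals a real key value
    for rec in ordered:
        key_changed = rec[key_field] != previous
        if key_changed:
            group_id += 1
            previous = rec[key_field]
        # The first record after a key change can serve as the group's master.
        result.append({**rec, "group_id": group_id, "is_first": key_changed})
    return result

# Hypothetical input: two addresses share customer id C2.
records = [
    {"customer_id": "C2", "address": "12 Oak St"},
    {"customer_id": "C1", "address": "5 Elm Rd"},
    {"customer_id": "C2", "address": "90 Pine Ave"},
]
grouped = assign_group_keys(records)
```

Every record with the same customer ID ends up with the same `group_id`, and the `is_first` flag marks one record per group that could be treated as the master.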

Posted: Thu Sep 29, 2011 6:36 pm
by stuartjvnorton
Completely agree with BI-RMA: that's a much better fit than trying to shoe-horn it through a manufactured match pass.

Posted: Tue Oct 04, 2011 10:37 am
by vijaydasari
Thank you very much for the responses. The issue was resolved by changing the match type to Unduplicate Independent.