Matching Job Does Not Output Any Records

Posted: Mon May 07, 2012 7:55 pm
by BuddingDev
Hi,

I have an Undup matching job in place to test more than 20 million records.
It does not output any records to the duplicate, residual, clerical, or master links, even after running for a long time. However, when I ran it on a small subset of this file (20,000 records), it finished successfully in a few minutes and delivered the expected records.

Can anybody tell me what the issue could be? Please note that my first match pass uses strict blocking columns, which are a combination of a couple of fields.

Posted: Mon May 07, 2012 8:02 pm
by ray.wurlod
If you look at the match statistics, is there much mention of overflow blocks? You may need to look at a blocking strategy that more finely discriminates potential duplicates, so that the block size is not exceeded.

If the records don't appear on any of those links, where DO they appear? These are the only possibilities. Are you perhaps not waiting for the job to finish?
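
To illustrate the blocking point above, here is a minimal Python sketch (not QualityStage code) that counts how many records fall into each block for a given set of blocking columns and reports the blocks that exceed a size limit. The column names `zip` and `surname`, the sample records, and the `limit` value are all hypothetical; the real overflow limit is whatever your match specification is configured with.

```python
from collections import Counter

def block_sizes(records, blocking_columns):
    """Count how many records share each blocking key."""
    return Counter(
        tuple(rec.get(col) for col in blocking_columns) for rec in records
    )

def oversized_blocks(records, blocking_columns, limit):
    """Return the blocking keys whose block holds more than `limit` records."""
    sizes = block_sizes(records, blocking_columns)
    return {key: n for key, n in sizes.items() if n > limit}

# Hypothetical sample data, for illustration only.
records = [
    {"zip": "10001", "surname": "SMITH"},
    {"zip": "10001", "surname": "SMITH"},
    {"zip": "10001", "surname": "JONES"},
    {"zip": "10001", "surname": "BROWN"},
    {"zip": "94105", "surname": "SMITH"},
]

# A coarse key puts four records in one block, which exceeds the assumed limit.
coarse = oversized_blocks(records, ["zip"], limit=3)

# Adding a second column splits that block, so nothing exceeds the limit.
fine = oversized_blocks(records, ["zip", "surname"], limit=3)
```

Running the same kind of count against the real input (for example with a simple aggregation job) would show whether any blocking key produces blocks large enough to overflow.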

Posted: Tue May 08, 2012 7:54 am
by BuddingDev
Thanks Ray,
:D

You were right about letting the job finish first. Having some patience took care of that issue.

Now I have one more question about undup matching.
Which option should I select for the following scenario?
I want two records to be considered match candidates only when both have field one populated. If either of them is missing field one, they should not be compared against each other.

Posted: Tue May 08, 2012 2:20 pm
by rjdickson
You have three options:
1. Make the two fields blocking columns in the passes in question. Blocking columns must be populated and identical for records to be considered a possible match.

2. Use CRITICAL variable type on each column in question. This means that a column must be populated and identical in order to be considered a match.

3. Using a Transformer, split the data so that records with no data in either one of those columns never go into the match. Of course, this means they will NEVER match in ANY pass :D
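
The effect of options 1 and 3 can be sketched in plain Python (this is an illustration of the idea, not QualityStage or Transformer code). Records are grouped on one column, and any record whose value is missing never enters a block, so it is never paired with anything. The column name `field_one`, the `id` key, and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def is_populated(value):
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""

def candidate_pairs(records, column):
    """Pair up record ids that share an identical, populated value in `column`."""
    blocks = defaultdict(list)
    for rec in records:
        value = rec.get(column)
        if is_populated(value):
            blocks[value].append(rec["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)

# Hypothetical sample data, for illustration only.
records = [
    {"id": 1, "field_one": "A123"},
    {"id": 2, "field_one": "A123"},
    {"id": 3, "field_one": ""},      # missing: never compared
    {"id": 4, "field_one": None},    # missing: never compared
    {"id": 5, "field_one": "B777"},  # populated, but alone in its block
]

pairs = candidate_pairs(records, "field_one")
```

Only records 1 and 2 end up as a candidate pair. Records 3 and 4 are not compared with each other even though both are empty, which is the behavior asked for in the question.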

Posted: Thu May 10, 2012 9:34 am
by BuddingDev
Thanks for the suggestions.