Name Standardization issue 7.5 vs 8x

dodda · Post by **dodda** » Tue Dec 01, 2009 11:10 am

Hi all,

I have worked a little with QualityStage 7.5 (I am not an expert).
In 7.5, when I standardized names using literals (ZQ), I got MatchFirstName and soundex results, but when I do the same in version 8 it shows an unhandled pattern.

example:

ZQ MARTIN ZQ K ZQ SMITH
ZQ VENKAT ZQ K ZQ SMITH

The above two strings, when analyzed using the USNAME rule set, parsed correctly (no unhandled pattern) in version 7.5, but the same is not true in version 8.x.

Can someone guide me on what needs to be changed in the USNAME rule set of v8 in order to accept non-US names?

Thanks in advance

JRodriguez · Post by **JRodriguez** » Wed Dec 02, 2009 10:02 am

Dodda,

One way to make it work is using classification overrides.

Just double-click the rule set and the Rules Management window will pop up:

Rules Management tool --> Overrides --> Classification

Classify all your names based on your investigation frequency report.

dodda · Post by **dodda** » Thu Dec 03, 2009 7:55 am

Thanks Julio for your reply.
I am pretty sure that I did not use any overrides in my previous project on 7.5. I am not sure that classification overrides will solve my problem, as we would be getting anonymous names. If we have to classify each name, what about new names that are not yet in the system? I need a direction to solve this problem because I will be using the standardized names later for matching; if a name turns out to be an unhandled pattern, the matching fails per the logic.

Thanks for your suggestion; I am looking for more.

Thanks all

ray.wurlod · Post by **ray.wurlod** » Thu Dec 03, 2009 2:15 pm

Somewhere, somehow, you are going to have to classify some names as first names. You probably also need to parse the pattern + | I | + into first name, middle name and primary name buckets.
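To make the bucketing idea concrete, here is a rough sketch in Python. It is not Pattern Action Language, only an illustration of the logic. The classifier is a toy assumption (single letters become I, other alphabetic words stay unclassified as +, anything else gets a made-up OTHER marker), dropping the ZQ literal is also an assumption, and the function names and the unhandled output keys are hypothetical.

```python
def classify(token):
    """Toy classifier (an assumption for this sketch, not the real USNAME table)."""
    if len(token) == 1 and token.isalpha():
        return "I"      # single letter treated as an initial
    if token.isalpha():
        return "+"      # unclassified alphabetic word
    return "OTHER"      # made-up marker for anything else


def bucket_name(name):
    """Build the pattern for a name and bucket it if the pattern is + | I | +."""
    # Dropping the ZQ literal is an assumption made for this illustration.
    tokens = [t for t in name.split() if t != "ZQ"]
    pattern = " | ".join(classify(t) for t in tokens)
    if pattern == "+ | I | +":
        first, middle, primary = tokens
        return {"FirstName": first, "MiddleName": middle, "PrimaryName": primary}
    # Any other pattern is left unhandled in this sketch.
    return {"UnhandledPattern": pattern, "UnhandledData": " ".join(tokens)}


if __name__ == "__main__":
    print(bucket_name("ZQ VENKAT ZQ K ZQ SMITH"))
    print(bucket_name("MARTIN SMITH"))
```

The point is that one rule keyed on the pattern covers both MARTIN K SMITH and VENKAT K SMITH without either name being in a classification table.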

JRodriguez · Post by **JRodriguez** » Thu Dec 03, 2009 3:41 pm

Dodda,

Notice that with unhandled pattern overrides you are solving not individual names but patterns, so with one pattern override rule you handle any unclassified name that falls into that pattern. A good trick is to proactively add the most common patterns in your data (found with an Investigation job) as unhandled pattern override rules.
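As a rough illustration of what "most common patterns" means, the following Python sketch counts patterns over a list of names. It only mimics the idea of an investigation frequency report and does not call QualityStage; the token classes come from a toy assumption (single letters are I, everything else is +), and all function names are hypothetical.

```python
from collections import Counter


def token_class(token):
    """Toy assumption: single letters are initials (I), everything else is +."""
    return "I" if len(token) == 1 and token.isalpha() else "+"


def pattern_of(name):
    """Turn a name into a pattern string such as '+ | I | +'."""
    return " | ".join(token_class(t) for t in name.split())


def most_common_patterns(names, top_n=3):
    """Return the top_n (pattern, count) pairs, most frequent first."""
    return Counter(pattern_of(n) for n in names).most_common(top_n)


if __name__ == "__main__":
    sample = ["MARTIN K SMITH", "VENKAT K SMITH", "MARTIN SMITH"]
    for pattern, count in most_common_patterns(sample):
        print(count, pattern)
```

The patterns at the top of such a list are the candidates worth adding as overrides first, since each one covers every name of that shape.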

If I know which names are likely to come in my data, I would classify them up front instead of waiting for the tool to treat them as unhandled patterns.

I don't remember using just the ZQ literal to identify anonymous names. As a matter of fact, the USNAME rule set will set them to null (you can see it in the Pattern Action File). I remember seeing orphan literals (ZQ) in the USPREP preprocessor output after using metadata delimiters (ZQ Domain ZQ).

dodda · Post by **dodda** » Fri Dec 04, 2009 6:00 pm

Thanks guys for all your inputs. I am going to try a few of the things suggested here over the weekend and on Monday, and will let you know the results!

Just my observation:
I believe that the +|+ format is handled in the GBNAME rule set. When I send Vekant K Dodda to GBNAME, it actually parses the name, so I am planning to combine the patterns in GBNAME and USNAME after some analysis.

I'll get back in touch with you with my results.

Thanks all for your valuable suggestions.

dodda · Post by **dodda** » Tue Dec 08, 2009 2:54 pm

Thanks again all for your suggestions.

I've investigated the names and found the most common patterns and added them to overrides, thus solving the problem.

There are too many names to classify, and we know for sure that we will get new names, so I separated out the most common patterns and added them as overrides.

Thanks guys !