
Ruleset for standardizing Mexican Tax Payer ID / IFE / etc.

Posted: Wed Jul 08, 2009 5:07 pm
by divstands
I want to know whether a ruleset exists for standardizing the Mexican IFE, RFC or CURP (the counterparts of the US TAXID).

I tried searching the IBM site but was unable to find a page that lists the countries and the entities for which rulesets exist.

Also, IBM personnel support is minimal, hence the query.

Posted: Wed Jul 08, 2009 8:47 pm
by ray.wurlod
From where I sit, the answer is "almost certainly not" unless you can track down someone who has already built one. If IBM had one they probably would have shipped it.

Posted: Thu Jul 09, 2009 12:06 am
by divstands
ray.wurlod wrote:From where I sit, the answer is "almost certainly not" unless you can track down someone who has already built one. If IBM had one they probably would have shipped it. ...
Hmm... true. Now, is it possible to write a rule for

'examining a string character by character, analyzing it and then correcting it'?

For example,

a tax payer ID 'FGHJK4543FDGF65' has to be validated against the pattern 'NNNNAAAANNNNAAANN'
where N = number
A = alphabetic character
and the string length has to be 18 (as shown above)


I understand that a ruleset works when the input has separators (mostly spaces). But what about a ruleset for a single unbroken string?

Posted: Thu Jul 09, 2009 9:03 am
by JRodriguez
DvStand,

You can definitely develop a Rule Set to accomplish your requirement using Pattern Action Language. It works for any string, complex or single. You will want to use operands such as substring, length and template (PICT lets you test special formats like yours) ... among others. There are some examples of how to use them in the Pattern Action Language Reference documentation that comes with the product.

I would suggest that you start from an existing Rule Set as the base for your development, such as USTAXID, which validates US Social Security Numbers.
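
As a language-neutral illustration of the length and template checks mentioned above, here is a minimal Python sketch. It is not Pattern Action Language and does not reproduce any QualityStage operand; the helper names `template_to_regex` and `matches_template` are hypothetical. It assumes the notation from the question (N = digit, A = letter) and takes the required length from the template itself, which as written has 17 positions.

```python
import re

# Assumed notation from the question: N = digit, A = alphabetic character.
_CLASSES = {"N": "[0-9]", "A": "[A-Za-z]"}


def template_to_regex(template):
    """Translate a template such as 'NNAA' into a compiled regular expression."""
    parts = []
    for symbol in template:
        if symbol not in _CLASSES:
            raise ValueError("unsupported template symbol: %r" % symbol)
        parts.append(_CLASSES[symbol])
    return re.compile("".join(parts))


def matches_template(value, template):
    """Return True when value has the template's length and character types."""
    if len(value) != len(template):
        return False
    return template_to_regex(template).fullmatch(value) is not None


TEMPLATE = "NNNNAAAANNNNAAANN"

# The ID quoted in the question has 15 characters, so it fails the length check.
print(matches_template("FGHJK4543FDGF65", TEMPLATE))    # False
# A made-up value that follows the template position by position.
print(matches_template("1234ABCD5678EFG90", TEMPLATE))  # True
```

A check like this only flags non-conforming values; deciding how to correct them still needs explicit rules per pattern.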

Posted: Thu Jul 09, 2009 3:54 pm
by divstands
JRodriguez wrote:DvStand,

You can definitely develop a Rule Set to accomplish your requirement using Pattern Action Language. It works for any string, complex or single. You will want to use operands such as substring, length and template (PICT lets you test special formats like yours) ... among others. There are some examples of how to use them in the Pattern Action Language Reference documentation that comes with the product.

I would suggest that you start from an existing Rule Set as the base for your development, such as USTAXID, which validates US Social Security Numbers.
Yeah, I saw that yesterday. I am in the process of building the same now.

Posted: Thu Jul 09, 2009 4:16 pm
by ray.wurlod
Even an "all T" word investigation might be enough for this particular requirement.

Posted: Thu Jul 09, 2009 4:20 pm
by divstands
ray.wurlod wrote:Even an "all T" word investigation might be enough for this particular requirement. ...
But the investigation identifies 400 patterns, of which only one is correct (it definitely accounts for 90% of the data).

The remaining 10%, however small, is very important to clean, especially for a financial organization like a bank. Hence, treating the 399 incorrect patterns (10% of the data) calls for a standardization ruleset.
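
To make the pattern counts above concrete, here is a rough Python sketch of a character-type frequency report, similar in spirit to a character investigation with a type mask. It is an approximation, not the product's behavior: the output symbols ('n' for digits, 'a' for letters, other characters kept as they are), the function names and the sample values are all assumptions made for illustration.

```python
from collections import Counter


def char_type_pattern(value):
    """Map each character to a type symbol: 'n' digit, 'a' letter, else unchanged."""
    symbols = []
    for ch in value:
        if ch.isdigit():
            symbols.append("n")
        elif ch.isalpha():
            symbols.append("a")
        else:
            symbols.append(ch)
    return "".join(symbols)


def pattern_frequencies(values):
    """Return (pattern, count) pairs, most frequent pattern first."""
    return Counter(char_type_pattern(v) for v in values).most_common()


# Made-up sample values; real input would come from the ID column.
samples = [
    "1234ABCD5678EFG90",
    "9876WXYZ4321QRS11",
    "FGHJK4543FDGF65",
    "1234-ABCD",
]

for pattern, count in pattern_frequencies(samples):
    print(count, pattern)
```

Sorting by frequency shows the dominant pattern first and leaves the long tail of rare patterns to be handled one rule at a time.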

Posted: Thu Jul 09, 2009 4:55 pm
by ray.wurlod
Fair comment. But "fix" may be difficult in all cases unless your rules are rigidly designed.