Token Class: Table has duplicate entries

Infosphere's Quality Product

Moderators: chulett, rschirm

bireswar.goswami
Participant
Posts: 33
Joined: Wed Sep 03, 2008 5:48 am
Location: Bangalore

Post by bireswar.goswami »

Hi All,

I have two parallel jobs that standardize person names and organization names separately. Each job reads an input file and creates an output file with the standardized data. I am using the USNAME standardization rule set. When I run them, both jobs abort with:
<JOBNAME.rule name>,0:Token Class.Table has duplicate entries
<JOBNAME.rule name>,0:Error Occured in file ........ .CLS file
and so on.

It would be great if someone could let me know why this error is occurring and how I can resolve it.

Thanks in advance.

Bireswar.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Look in your classification table (the .CLS file for the rule set). It appears that there is at least one duplicate entry therein. The file should be sorted. Why not process it with a DataStage job to identify the duplicate(s)?
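
For anyone who would rather check outside DataStage, here is a minimal Python sketch of the same idea. It is not an IBM tool. It assumes the .CLS file is plain text with one entry per line, that the token is the first whitespace-separated field, and that lines starting with `;` (comments) or `\` (directives such as a format header) can be skipped. The function name `find_duplicate_tokens` is made up for illustration, and quoted tokens that contain spaces are not handled.

```python
from collections import defaultdict


def find_duplicate_tokens(lines):
    """Return {token: [line numbers]} for tokens seen more than once in column one."""
    seen = defaultdict(list)
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Assumption: blank lines, ';' comments and '\' directives carry no token.
        if not line or line.startswith(";") or line.startswith("\\"):
            continue
        token = line.split()[0]  # first column only
        seen[token].append(lineno)
    return {tok: nums for tok, nums in seen.items() if len(nums) > 1}


if __name__ == "__main__":
    import sys

    # Usage (the script name is hypothetical): python find_cls_dups.py <path to .CLS file>
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
            for token, linenos in sorted(find_duplicate_tokens(fh).items()):
                print(f"{token}: lines {', '.join(map(str, linenos))}")
```

The comparison is exact, so `SMITH` and `Smith` count as different tokens; upper-case the token before storing it if the duplicate check should ignore case.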
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

As Ray said, there is a duplicate entry in the CLS file of your ruleset, i.e. the value in the first column shows up more than once.
You can check it by opening up the ruleset and clicking the Test button.

CLS file problems will be reported in a dialog before you get a chance to type anything. It will also tell you what the duplicate value is and what line it's on, IIRC.
It may be time consuming to fix if you have a bunch of them, but easy enough if you only have 1 or 2.

If it's not just a duplicate entry and you actually _need_ it twice for different token types, you might have to do something like the USADDR ruleset does: classify it as a multi-use token and then write a SUB to try to reclassify it based on the context.

Hope this helps. Apologies if Ray already had this further down in the Premium part...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

No I didn't, so thanks for the extra info. It's interesting that there is a duplicate name in the USNAME.CLS file, which theoretically at least is read only in version 8. Hmm. Will need to check mine - being in Australia I don't get to use the USNAME rule set all that much.

Bireswar, are you using any overrides? Particularly classification overrides? If so, might one of those have been duplicated?
vairus
Participant
Posts: 52
Joined: Thu Feb 07, 2008 8:02 am
Location: Johannesburg

Post by vairus »

Hi ,

Open the ruleset and click TEST. It shows you which value is duplicated in the classification table...
vairamuthu