Pattern Action File Code Validation

Infosphere's Quality Product

JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

Well ... you are on the right track ...

Take a look at the variable temp: it is never cleared. So if the token [1] that you repeat in all your actions is really meant to be [1], [2], [3] and [4], then you will have:

{Exterior} = [1] concatenated with [2]
{Interior} = [1], [2] and [3]
{Calle} = the entire pattern's value

Also the data dictionary fields will contain the "\" string that you are appending... I guess that's not what you want to accomplish ...
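
To see why the values pile up, here is a small Python sketch (not Pattern Action Language) that mimics a scratch variable that is appended to but never cleared. It assumes the four actions were meant to use [1] through [4], that temp starts out empty, and that the separator is a single space. The target order (Calle, Exterior, Interior, then Calle again) follows the actions posted in this thread; `run_actions` and the token values are made up for illustration.

```python
def run_actions(tokens):
    """Mimic: CONCAT sep temp; COPY [n] temp2; CONCAT temp2 temp; COPY temp {field}."""
    fields = {}
    temp = ""  # assumption: temp is empty when the pattern fires
    targets = ["Calle", "Exterior", "Interior", "Calle"]
    for token, field in zip(tokens, targets):
        temp += " "           # CONCAT " " temp
        temp2 = token         # COPY [n] temp2
        temp += temp2         # CONCAT temp2 temp
        fields[field] = temp  # COPY temp {field}; temp is NOT cleared afterwards
    return fields


leaky = run_actions(["54", "36", "78", "Pedro"])
for name, value in leaky.items():
    print(f"{name!r}: {value!r}")
```

Each field receives everything appended so far, which is why the last target ends up holding the whole pattern.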

We can be more helpful if you post some sample data and your expected results...
Julio Rodriguez
ETL Developer by choice

"Sure we have lots of reasons for being rude - But no excuses"
divstands
Participant
Posts: 128
Joined: Wed Jun 03, 2009 9:48 am

Post by divstands »

The unhandled data looks like:
54 36 78 Pedro

And the expected dictionary field entries should be:
Calle: XYZ 54
Exterior: XYZ 36
Interior: XYZ 78
Asentamiento: XYZ Pedro

where XYZ = any string (NULL or NOT NULL) produced by the regular standardization according to the rule set.
Divya

Post by JRodriguez »

It's difficult to say without having the pattern action file for your Rule Set and without any knowledge of Mexican addresses. Normally UNHANDLED PATTERN refers to the entire pattern, and it looks like in your case part of the record has been standardized and part of it has been identified as "additional address" or "unhandled data", and you just want to add that data to the proper dictionary fields. Yes?

As a general approach, you would prepare the data for the Rule Set using a PREP rule set before applying the final rule set. Sometimes further preparation is needed to get good results. Does the MEX Rule Set have a PREP Rule Set? Can you mimic the logic in previous stages?

Finally, I would add a subroutine with all my pattern actions to fix the data issues, and I would call it at the point in the current Pattern Action flow where the rule set is moving the data around. That way you don't need to process the data twice. Just be aware that the subroutine will be called for each row of data.

By the way, you may want to add an extra dictionary field ("ProcessedByPattern") to log which pattern action was used to process the record. This saves a lot of debugging time ...
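
The idea behind the logging field can be sketched in Python (again, not Pattern Action Language): whichever handler processes the record also stamps the pattern it matched. The `standardize` function, the `handlers` table, and the field mapping are hypothetical and only illustrate the bookkeeping.

```python
def standardize(token_classes, tokens, handlers):
    """Run the handler registered for this pattern and record which pattern it was."""
    pattern = "".join(token_classes)
    record = {"ProcessedByPattern": ""}  # stays empty when no handler applies
    handler = handlers.get(pattern)
    if handler is not None:
        record.update(handler(tokens))
        record["ProcessedByPattern"] = pattern
    return record


# Hypothetical handler table: pattern string -> function filling dictionary fields.
handlers = {
    "^^^+": lambda t: {
        "Calle": t[0],
        "Exterior": t[1],
        "Interior": t[2],
        "Asentamiento": t[3],
    },
}

row = standardize(["^", "^", "^", "+"], ["54", "36", "78", "Pedro"], handlers)
unmatched = standardize(["+", "+"], ["San", "Pedro"], handlers)
print(row["ProcessedByPattern"], unmatched["ProcessedByPattern"])
```

When a record comes out wrong, the logged pattern tells you which block of actions to inspect first.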

Post by JRodriguez »

It's a bit different. What I mean is:

- Add a subroutine to the pattern action file, probably at the end
- In the subroutine, add all your new pattern actions
- Call the subroutine at the point in the current pattern action file where the piece of data that will be marked as "unhandled" is determined

When a new release of the MEX Rule Set is available from IBM, you will need to incorporate your custom code into the new pattern action file. Grouping your pattern actions in a subroutine makes this task easier than having them scattered in different places in the file.

You will need to add the field to the dictionary file, and for each new pattern that you are processing, just copy the pattern to the field (see below). After saving the pattern file, just do a normal provision of the rule set.

^ | ^ | ^ | +
COPY "^^^+" {ProcessedByPattern}
CONCAT \" \" temp
COPY [1] temp2
CONCAT temp2 temp
COPY temp {Calle}
CONCAT \" \" temp
COPY [1] temp2
CONCAT temp2 temp
COPY temp {Exterior}
CONCAT \" \" temp
COPY [1] temp2
CONCAT temp2 temp
COPY temp {Interior}
CONCAT \" \" temp
COPY [1] temp2
CONCAT temp2 temp
COPY temp {Calle}
RETYPE [1] 0
RETYPE [2] 0
RETYPE [3] 0
RETYPE [4] 0
RETURN

Post by divstands »

Also, for this code:

M | + | Q | ?
COPY [1] {Manzana}
COPY [2] {Manzana}
COPY [3] {ExceptionData}
COPY [4] {Asentamiento}
RETYPE [1] 0
RETYPE [2] 0
RETYPE [3] 0
RETYPE [4] 0
RETURN


I expect the field {Manzana} to contain "M +".
Is the code fine, or should I add a CONCAT to get the expected result?
Divya

Post by divstands »

Also, is it better to use COPY_S everywhere in the last post, instead of COPY?
Divya

Post by JRodriguez »

Well ... you would want something like:

M | + | Q | ?
COPY [1] temp
CONCAT " " temp
CONCAT [2] temp
COPY temp {Manzana}
COPY [3] {ExceptionData}
COPY_S [4] {Asentamiento}
RETYPE [1] 0
RETYPE [2] 0
RETYPE [3] 0
RETYPE [4] 0
RETURN

Only [4] uses COPY_S, to preserve the spaces in between; the rest of the tokens are single words anyway.
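
The difference can be mimicked in a few lines of Python (not Pattern Action Language). The sketch assumes that a second COPY into the same field overwrites the first, and that plain COPY runs the words of a multi-word operand together while COPY_S keeps the spaces. The `copy` and `copy_s` helpers and the token values are made up for illustration.

```python
def copy(words):
    """Assumed COPY behavior: words of a multi-word operand are run together."""
    return "".join(words)


def copy_s(words):
    """Assumed COPY_S behavior: spaces between the words are preserved."""
    return " ".join(words)


# Made-up token values for the pattern M | + | Q | ?
tokens = {1: ["MZ"], 2: ["A"], 3: ["X"], 4: ["SAN", "PEDRO"]}

fields = {}

# Two COPYs into the same field: the second replaces the first.
fields["Manzana"] = copy(tokens[1])
fields["Manzana"] = copy(tokens[2])
overwritten = fields["Manzana"]

# Building the value in a scratch variable first keeps both tokens.
temp = copy(tokens[1])    # COPY [1] temp
temp += " "               # CONCAT " " temp
temp += copy(tokens[2])   # CONCAT [2] temp
fields["Manzana"] = temp  # COPY temp {Manzana}

fields["ExceptionData"] = copy(tokens[3])
fields["Asentamiento"] = copy_s(tokens[4])

print(overwritten, "|", fields["Manzana"], "|", fields["Asentamiento"])
```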