Creating Patterns from business masks

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

I have a requirement to validate input based on "masks" the business has set up for the field.
Business Masks look like:
ACEnnnnnnAP
BGFnnn-nnnXGF
There are a couple of hundred masks defined. My plan was to dynamically build a parameter to use with the pattern matching operator, something like FieldName Matches "'ACE'6N'AP'":Char(253):"'BGF'3N'-'3N'XGF'".

I'm looking for thoughts on how to build the pattern, and in particular on how to separate the literal strings and special characters from the numerics.

Thanks
Dick
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Interesting... what other 'mask characters' will you have besides the 'n'? Will they always be the only lower-case letters in the mask?
-craig

"You can never have too many knives" -- Logan Nine Fingers
danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

There are others, always lower case, including b, c, l, and a. I don't have the doc handy right now, but I can post them all first thing tomorrow.
Thanks
danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

Here are all the 'mask' characters:
n - numeric
o - optional numeric
l - alpha
m - optional alpha
a - alphanumeric
b - optional alphanumeric

Always lower case. Upper case strings are just that. Optional means spaces are allowed.
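
For anyone who wants to try the mask rules outside DataStage, here is a minimal sketch in Python (not DS BASIC) that turns one mask into a regular expression. It reads "optional" as "a space is also allowed", as stated above. The names MASK_CLASSES, mask_to_regex and matches_mask are made up for illustration, and any character not in the table (including a lower-case letter that is not one of the six listed) is treated as a literal, which is an assumption.

```python
import re

# Character classes for the six lower-case mask letters listed above.
# "Optional" is read as "a space is also allowed".
MASK_CLASSES = {
    "n": "[0-9]",         # numeric
    "o": "[0-9 ]",        # optional numeric
    "l": "[A-Za-z]",      # alpha
    "m": "[A-Za-z ]",     # optional alpha
    "a": "[A-Za-z0-9]",   # alphanumeric
    "b": "[A-Za-z0-9 ]",  # optional alphanumeric
}


def mask_to_regex(mask):
    """Translate one business mask into a compiled regular expression."""
    parts = []
    for ch in mask:
        # Mask letters become character classes; anything else is
        # escaped and matched literally (assumption for unknown letters).
        parts.append(MASK_CLASSES.get(ch, re.escape(ch)))
    return re.compile("".join(parts))


def matches_mask(value, mask):
    """True if the whole value fits the mask."""
    return mask_to_regex(mask).fullmatch(value) is not None


print(matches_mask("ACE123456AP", "ACEnnnnnnAP"))      # True
print(matches_mask("BGF123-456XGF", "BGFnnn-nnnXGF"))  # True
print(matches_mask("ACE12345XAP", "ACEnnnnnnAP"))      # False
```

Because fullmatch is used, the value has to be exactly as long as the mask; a longer or shorter value fails.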

Thanks.
Dick
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Here are the matching pattern-match tokens.
1N - numeric
1N:@VM:" " - optional numeric
1A - alphabetic
1A:@VM:" " - optional alphabetic

Testing for alphanumeric is more complex using pattern matching, but you can Convert() alphanumeric characters to "" and work with what you have left.
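
To illustrate the "convert the alphanumerics to nothing and look at what is left" idea, here is a rough Python sketch (not DS BASIC). str.translate with a deletion table plays the part of Convert() here; the function names and the allow_space flag are hypothetical, and requiring a non-empty value is an assumption.

```python
import string

# Deletion table: every ASCII letter and digit is removed, which mirrors
# converting those characters to "".
_STRIP_ALNUM = str.maketrans("", "", string.ascii_letters + string.digits)


def leftover_after_stripping(value):
    """Return whatever remains once letters and digits are removed."""
    return value.translate(_STRIP_ALNUM)


def is_alphanumeric(value, allow_space=False):
    """True if value is non-empty and holds only letters and digits.

    With allow_space=True, spaces are tolerated as well (the
    "optional alphanumeric" case).
    """
    leftover = leftover_after_stripping(value)
    if allow_space:
        leftover = leftover.replace(" ", "")
    return value != "" and leftover == ""


print(is_alphanumeric("A1b2"))                     # True
print(is_alphanumeric("A1-b2"))                    # False
print(is_alphanumeric("A1 b2", allow_space=True))  # True
```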

A routine may be an easier way to go, particularly if (a) the mask is passed as an argument, and (b) the non-mask characters are always upper-case and the mask characters are always lower case.

Code:

FUNCTION MaskVal(aTheString,aTheMask)
* Argument validation has been omitted for clarity
* Returns 1 if aTheString matches aTheMask, 0 otherwise.
If Len(aTheString) = Len(aTheMask)
Then
   Ans = 1
   vLenString = Len(aTheString)
   For vCharPos = 1 To vLenString
      vStringChar = aTheString[vCharPos,1]
      vMaskChar = aTheMask[vCharPos,1]
      If vMaskChar = Upcase(vMaskChar)
      Then
         * Upper case mask character
         If vStringChar <> vMaskChar
         Then
            Ans = 0 ; * non-matching character
            Exit
         End
      End
      Else
         * Not upper case mask character (assume lower case)
         Begin Case
            Case vMaskChar = "n"
               If Not(vStringChar Matches "1N")
               Then 
                  Ans = 0 ; * non-numeric where numeric required
                  Exit
               End
            Case vMaskChar = "o"
               If Not(vStringChar Matches "1N") And (vStringChar <> " ")
               Then
                  Ans = 0 ; * neither numeric nor space where optional numeric required
                  Exit
               End
            Case vMaskChar = "l"
               If Not(vStringChar Matches "1A")
               Then 
                  Ans = 0 ; * non-alphabetic where alphabetic required
                  Exit
               End
            Case vMaskChar = "m"
               If Not(vStringChar Matches "1A") And (vStringChar <> " ")
               Then 
                  Ans = 0 ; * neither alphabetic nor space where optional alphabetic required
                  Exit
               End
            Case vMaskChar = "a"
               If Not(vStringChar Matches "1N") And Not(vStringChar Matches "1A")
               Then 
                  Ans = 0 ; * non-alphanumeric where alphanumeric required
                  Exit
               End
            Case vMaskChar = "b"
               If Not(vStringChar Matches "1N") And Not(vStringChar Matches "1A") And (vStringChar <> " ")
               Then
                  Ans = 0 ; * neither alphanumeric nor space where optional alphanumeric required
                  Exit
               End
         End Case
      End
   Next vCharPos
End
Else
   Ans = 0  ; * lengths are not the same
End
RETURN(Ans)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

Thanks for the reply. I'm not sure a routine will work because the string can fit one (or more) of many masks. Input is a couple of million rows and there are 200 potential masks to check.
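
To make the "one (or more) of many masks" concern concrete, here is a hedged Python sketch (not DS BASIC) of a per-character check run against a whole list of masks, returning every mask the value fits. Grouping the masks by length first is my own suggestion for cutting down the comparisons, not something from the thread. All names are hypothetical. Unlike the routine above, a lower-case letter outside the six known mask letters is treated as a literal here.

```python
import string
from collections import defaultdict

# Allowed characters per lower-case mask letter; "optional" adds a space.
ALLOWED = {
    "n": set(string.digits),
    "o": set(string.digits + " "),
    "l": set(string.ascii_letters),
    "m": set(string.ascii_letters + " "),
    "a": set(string.ascii_letters + string.digits),
    "b": set(string.ascii_letters + string.digits + " "),
}


def fits(value, mask):
    """True if value matches mask character by character."""
    if len(value) != len(mask):
        return False
    for v, m in zip(value, mask):
        allowed = ALLOWED.get(m)
        if allowed is None:
            if v != m:            # literal position must match exactly
                return False
        elif v not in allowed:    # mask position must be in its class
            return False
    return True


def index_by_length(masks):
    """Group masks by length so only same-length masks are tried."""
    index = defaultdict(list)
    for mask in masks:
        index[len(mask)].append(mask)
    return index


def matching_masks(value, index):
    """Return every mask (of the right length) that the value fits."""
    return [m for m in index.get(len(value), []) if fits(value, m)]


masks = ["ACEnnnnnnAP", "BGFnnn-nnnXGF", "ACEaaaaaaAP"]
index = index_by_length(masks)
print(matching_masks("ACE123456AP", index))  # ['ACEnnnnnnAP', 'ACEaaaaaaAP']
print(matching_masks("ACE12X456AP", index))  # ['ACEaaaaaaAP']
print(matching_masks("ZZZ", index))          # []
```

The index is built once, before the rows are read, so each row is only compared against masks of its own length.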

Perhaps a routine is what I need to build the Pattern? Read the business masks and separate the bytes into working storage fields that break as the case changes.

Then flatten the records into one using stage variables.
danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

Notes on resolution:
I used a LOOP...UNTIL to process the business mask one byte at a time. Using Upcase I was able to keep literal strings together (ABC-, for example) and store them in "working storage". At the same time I set an indicator recording which pattern matching token the run represented. When the run was broken, I started a new field and indicator.
When the last byte had been processed, each field was converted according to its indicator: the field ABC- with indicator U became 'ABC-', nnnnnn with indicator N became 6N, ooo with indicator Nv became 3N:@VM:, and so on.
Ans = vfinal1:vfinal2:...

In the next transformer I flattened the new masks into one record using a stage variable and a hashed file with a dummy key.

Sample of the final result:
"'A1000S'7N":Char(253):"'00011'1A4N'G'":Char(253):"'00011'2A3N'G'"
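
For readers who want to see the run-splitting idea in isolation, here is a rough sketch in Python (not the DS BASIC routine actually used). It groups consecutive mask bytes of the same kind and emits a quoted literal or a count-plus-token for each run. Only n (N) and l (A) are handled, because those are the only tokens that appear in the sample above; the optional and alphanumeric letters are deliberately rejected rather than guessed at. All names are hypothetical, and a literal containing a single quote is not handled.

```python
from itertools import groupby

# Only the tokens visible in the sample result are supported:
# n -> N (numeric) and l -> A (alphabetic).
TOKENS = {"n": "N", "l": "A"}
# Mask letters whose converted form is not fully spelled out in the thread.
UNSUPPORTED = set("omab")

VM = chr(253)  # value mark, the same character as Char(253)


def run_kind(ch):
    """Classify one mask byte: a pattern token, or 'U' for literal text."""
    if ch in TOKENS:
        return TOKENS[ch]
    if ch in UNSUPPORTED:
        raise NotImplementedError("mask letter %r is not handled in this sketch" % ch)
    return "U"


def mask_to_pattern(mask):
    """Convert one business mask into a single pattern string."""
    parts = []
    # groupby splits the mask wherever the kind of byte changes.
    for kind, run in groupby(mask, key=run_kind):
        text = "".join(run)
        if kind == "U":
            parts.append("'" + text + "'")        # quoted literal run
        else:
            parts.append(str(len(text)) + kind)   # e.g. 6N or 2A
    return "".join(parts)


def combine(masks):
    """Flatten all patterns into one multi-valued string."""
    return VM.join(mask_to_pattern(m) for m in masks)


print(mask_to_pattern("ACEnnnnnnAP"))    # 'ACE'6N'AP'
print(mask_to_pattern("BGFnnn-nnnXGF"))  # 'BGF'3N'-'3N'XGF'
print(mask_to_pattern("00011lnnnnG"))    # '00011'1A4N'G'
```

The mask 00011lnnnnG in the last line is my guess at what would produce the second pattern in the sample; the original business masks behind that sample were not posted.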

Thanks for turning on the light.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

8)