Non-greedy pattern matching?

Posted: Thu Jul 19, 2012 3:18 am
by surfsup
Howdy,

For reasons too convoluted to explain, I need to interpret numbers written in scientific notation, and I need to do it in BASIC. I need to extract the sign, the base, the exponent sign and the exponent.

To somewhat duplicate the functionality of QS, I initially thought of using pattern matching. The main problem is that QS pattern matching is greedy, and the IBM docs don't mention this anywhere.

I know this is a long shot, but is there a magic way of changing the pattern matching in BASIC from greedy to non-greedy?

Although the mask below isn't correct, it illustrates my point better than the correct one: the 0x gobbles up as much as it can, too much to be useful.

Code:

value: 1.234e10 mask:1N0x0N1x0N returns 4 fields: 1 | .234e1 || 2 (third field is empty).
I'd expect it to return 5 fields: 1 | . | 234 | e | 12
Edit: changed engine to server
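
The difference being asked about can be sketched outside DataStage. The snippet below is only an analogy in Python's re module, not QS or BASIC code: the regular expressions are an assumed rough translation of the 1N0x0N1x0N mask, and the input 1.234e12 is assumed because the fields quoted above end in 12. It shows how a greedy "any characters" token swallows the fraction and most of the exponent, while a lazy one yields the five expected pieces.

```python
import re

VALUE = "1.234e12"  # assumed input; the fields quoted in the post end in 12

# Assumed regex stand-ins for the mask tokens: 1N -> \d, 0x -> .*, 0N -> \d*, 1x -> .
GREEDY = re.compile(r"(\d)(.*)(\d*)(.)(\d*)")
LAZY = re.compile(r"(\d)(.*?)(\d*)(.)(\d*)")  # same tokens, but .*? takes as little as possible

greedy_fields = GREEDY.fullmatch(VALUE).groups()
lazy_fields = LAZY.fullmatch(VALUE).groups()

print(" | ".join(greedy_fields))
print(" | ".join(lazy_fields))
```

With the lazy quantifier the any-character group stops at the decimal point, which is the behaviour the question is hoping for.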

Posted: Thu Jul 19, 2012 5:57 am
by ray.wurlod
How about being just slightly more specific with your mask? "0N'.'0N1A0N" in conjunction with the MatchField() function should work with a full (five piece) scientific notation number.

If the number is 1.234e10 then:
MatchField(InLink.TheNumber, "0N'.'0N1A0N",1) returns "1"
MatchField(InLink.TheNumber, "0N'.'0N1A0N",2) returns "."
MatchField(InLink.TheNumber, "0N'.'0N1A0N",3) returns "234"
MatchField(InLink.TheNumber, "0N'.'0N1A0N",4) returns "e"
MatchField(InLink.TheNumber, "0N'.'0N1A0N",5) returns "10"

Posted: Thu Jul 19, 2012 6:49 am
by surfsup
ray.wurlod wrote:How about being just slightly more specific with your mask? "0N'.'0N1A0N" in conjunction with the MatchField() function should work with a full (five piece) scientific notation number.

If the numbe ...
I can't see your full point and I may be missing something, but it doesn't seem as straightforward as writing the one correct greedy pattern.

Some numbers don't come in with a decimal point (e.g. 1E+10) and there are 18 variations I can think of:

Code:

(sign){number(s)}((decimal sign){number(s)}){exponent}(sign){number(s)}

Where () is optional, {} is mandatory.
There are 3 possibilities for each sign (+, - and absent) and 2 for the decimal part (the decimal sign plus the digits following it, either present or absent).
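
The count of 18 follows from 3 x 2 x 3 (leading sign, decimal part, exponent sign). A small Python sketch, purely illustrative and using made-up sample digits, enumerates the shapes:

```python
from itertools import product

signs = ["", "+", "-"]          # leading sign: absent, +, -
fractions = ["", ".234"]        # decimal part: absent or present (sample digits)
exponent_signs = ["", "+", "-"]  # exponent sign: absent, +, -

# Build one sample string per combination, e.g. "1E+10" or "-1.234E-10".
variants = [
    f"{sign}1{fraction}E{exp_sign}10"
    for sign, fraction, exp_sign in product(signs, fractions, exponent_signs)
]

print(len(variants))
for v in variants:
    print(v)
```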

The one non-greedy string that would match all would be:

Code:

(0X)(1N0N)(0X)(0N)(1X)(0X)(1N0N)
(sign){number}((decimal sign){number}){exponent}(sign){number}
This is assuming the interpreter would know to stop matching the X token when the first N token is found. In this scenario the MatchToken would work like a charm.

I could write 18 greedy patterns, ordered from the least to the most inclusive, but I think that would be harder to maintain (and uglier) than disassembling the string in a "non-pattern" fashion.
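
For comparison, the "non-pattern" disassembly could look roughly like the sketch below. It is written in Python rather than BASIC, so it only shows the splitting logic; the function name split_scientific and its return shape are hypothetical, and it assumes the grammar given above (digits are mandatory after a decimal sign, the exponent marker is E or e).

```python
DIGITS = "0123456789"

def is_digits(text):
    """True when text is a non-empty run of ASCII digits."""
    return text != "" and all(ch in DIGITS for ch in text)

def peel_sign(part):
    """Split an optional leading + or - from the rest of the string."""
    if part[:1] in ("+", "-"):
        return part[0], part[1:]
    return "", part

def split_scientific(text):
    """Return (sign, base, exponent_sign, exponent), or None if text does not fit."""
    mantissa, marker, exponent = text.strip().upper().partition("E")
    if marker != "E":
        return None  # no exponent marker at all
    sign, base = peel_sign(mantissa)
    exp_sign, exp_digits = peel_sign(exponent)
    whole, dot, fraction = base.partition(".")
    if not is_digits(whole) or not is_digits(exp_digits):
        return None
    if dot and not is_digits(fraction):
        return None  # a decimal sign must be followed by digits
    return sign, base, exp_sign, exp_digits

print(split_scientific("1.234e10"))
print(split_scientific("-1E+10"))
```

Splitting on the exponent marker first and peeling the signs afterwards handles all 18 shapes in one pass, with no ordering of patterns to maintain.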

Posted: Thu Jul 19, 2012 3:14 pm
by ray.wurlod
The Matches operator and the MatchField() function both allow multiple patterns, supplied as a value-mark-delimited list.
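
The idea of trying several masks in turn can be sketched as follows. This is a rough Python emulation, not DataStage code: the helper names mask_to_regex and match_fields are hypothetical, only the nN, nA, nX and quoted-literal tokens are handled, and their semantics are assumed. In BASIC the masks would be joined with value marks (@VM) rather than held in a Python list.

```python
import re

# One mask token: a count followed by N, A or X, or a quoted literal.
TOKEN = re.compile(r"(\d+)([NAX])|'([^']*)'")
CLASSES = {"N": r"\d", "A": "[A-Za-z]", "X": "."}

def mask_to_regex(mask):
    """Translate a mask such as 0N'.'0N1A0N into a regex with one group per token."""
    parts, pos = [], 0
    while pos < len(mask):
        token = TOKEN.match(mask, pos)
        if token is None:
            raise ValueError(f"unsupported mask token at position {pos}: {mask!r}")
        if token.group(3) is not None:
            parts.append("(" + re.escape(token.group(3)) + ")")
        else:
            count, kind = int(token.group(1)), token.group(2)
            # Assumed semantics: a count of 0 means "zero or more".
            quantifier = "*" if count == 0 else "{%d}" % count
            parts.append("(" + CLASSES[kind] + quantifier + ")")
        pos = token.end()
    return re.compile("".join(parts))

def match_fields(value, masks):
    """Return (mask, fields) for the first mask matching the whole value."""
    for mask in masks:
        found = mask_to_regex(mask).fullmatch(value)
        if found:
            return mask, list(found.groups())
    return None, []

masks = ["0N'.'0N1A0N", "0N1A0N"]  # with and without a decimal part
print(match_fields("1.234e10", masks))
print(match_fields("1e10", masks))
```

Listing the more specific mask first means a value with a decimal point is split into five pieces, while one without falls through to the shorter mask.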