Page 1 of 1

Bug in NUM() Function ?

Posted: Wed Jan 04, 2017 9:38 am
by skathaitrooney
Hi Experts,

I was using NUM() function in a transformer wherein i faced a peculiar issue.

Below is the definition of NUM function per IBM:
Num (expression)
expression is the expression to test. If expression is a number, a numeric string, or an empty string, a value of 1 is returned. If it is a null value, null is returned; otherwise 0 is returned.

Example: NUM(35E1)

After applying Num function it is should return 0, as field holds both numeric and alphanumeric characters. but it is returning 1

According to me due to the 'E' in the expresion, it is considering the number as an exponential form.

It would work as expected if i test it using NUM(35AE1), NUM(35C1) etc.

Any workaround anyone can suggest to overcome this ?

Posted: Wed Jan 04, 2017 4:46 pm
by ds_developer
I'd suggest adding a test for the 'E' such as:

Num(nodupes.BOUNDARY_AREA) AND Index(nodupes.BOUNDARY_AREA,'E',1)

Hope this helps,
John

Posted: Thu Jan 05, 2017 7:36 am
by qt_ky
Here is the reference text quoted from the Parallel Job Developer's Guide:
Num Returns 1 if string can be converted to a number, or 0 otherwise.
- Input: string (string)
- Output: result (int32)
Scientific notation strings can be converted to numbers, so perhaps the function is actually giving correct results.

Posted: Thu Jan 05, 2017 7:59 am
by skathaitrooney
Yes indeed.. But is their any workaround? I mean in my case this is a complete string and not a number in scientific notation ?
I have tried multiple workarounds to fail.
Any suggestions are welcomed.

Posted: Thu Jan 05, 2017 8:04 am
by qt_ky
As John suggested above, you could supplement the Num() function as a workaround to test for presence of an "E" within your string, then take action accordingly. That assumes that "E" is the only exception that you see getting through, in your case.

Another alternative is to search DSXchange on using "double convert" function method, and compare before and after values. That's an interesting and elegant method, although it's got to be overkill as far as CPU cycles go.

Posted: Thu Jan 05, 2017 9:30 am
by chulett
The "double convert" is useful when you want to remove all "bad" characters from a string without specifying them, rather you list the "good" ones to keep. I'm not sure that would be all that helpful here since (as you noted) the Index will work just fine to detect the 'E' in the string. I'll tweak John's code slightly and say that it is actually a number if:

Code: Select all

Num(nodupes.BOUNDARY_AREA) AND Index(nodupes.BOUNDARY_AREA,'E',1) = 0

-or-

Num(nodupes.BOUNDARY_AREA) AND Not(Index(nodupes.BOUNDARY_AREA,'E',1))

Posted: Thu Jan 05, 2017 11:00 am
by UCDI
lots of ways to do it... if it is just the E character, num(ereplace(input)) type statement might do the trick. This might be one of the more efficient ways, at the cost of only working for the one situation. If you have to start stringing junk together like num(ereplace(upcase(... input) at some point you have iterated over the string 20 times just to do what can be done in a single iteration*


*I can't think of a single iteration in datastage using the built in but weak string processing functions. That would require a dedicated function / routine type effort, but it is doable. Datastage's string processing leaves much to be desired, to be honest.

Posted: Fri Jan 06, 2017 2:55 am
by ray.wurlod
35E1 is a numeric representation ("scientific" notation). It represents

Code: Select all

 35 * 10 ** 1

Re: Bug in NUM() Function ?

Posted: Fri Jan 06, 2017 6:42 am
by chulett
We knows. From the original post:
skathaitrooney wrote:According to me due to the 'E' in the expresion, it is considering the number as an exponential form.

Posted: Fri Jan 06, 2017 9:08 am
by Mike
I'll toss out another option since no one has mentioned it yet (there are always options):

Code: Select all

IsValid("decimal[4,0]", "35E1")
This will return 0 since the string literal 35E1 is not a valid decimal(4,0).

Code: Select all

IsValid("decimal[4,0]", 35E1)
This will return 1 since the numeric literal 35E1 is a valid decimal(4,0).

Code: Select all

IsValid("decimal[4,0]", 35E3)
This will return 0 since the numeric literal 35E3 is not a valid decimal(4,0).

Mike