Page 1 of 1

Find Strings in a String Field

Posted: Thu Mar 17, 2011 2:27 pm
by swapnilverma
How can I find multiple strings in a String column.

in sql we can use --

select xyz from abc
where UPPER(DESC) like '%MANGO%ORANGE%'


Read some of the old post here and found that I can use index function to find a string in string field.

Is there any specific function available to search multiple strings ?


In the mean time i can use index function twice to achive my logic..

but what if i have more strings to be searched in one field. More if else ??

Re: Find Strings in a String Field

Posted: Thu Mar 17, 2011 2:56 pm
by jseclen
Hi swapnilverma,

The index function returns the position of a substring within a string, for your case:

this select:

select xyz from abc
WHERE UPPER (DESC) like '% MANGO% ORANGE%'

you can use the function as follows:

Index (UPPER (DESC) 'MANGO', 1) > 0 and Index (UPPER (DESC) 'ORANGE', 1) > 0

and return the records that matches both substrings

Posted: Thu Mar 17, 2011 6:47 pm
by swapnilverma
Thanks Saludos,

But I would still like to know if we have any function available to search multiple strings in one string column ...

Posted: Thu Mar 17, 2011 7:20 pm
by chulett
The only thing I can think of that would do that is matches but the problem is I don't believe it is a valid PX function. You'd have to use a BASIC Transformer or a Server Shared Container to bring that functionality into your job from what I recall.

Posted: Thu Mar 17, 2011 7:34 pm
by ray.wurlod
There's nothing out of the box in the parallel Transformer stage for searching for multiple string. Matches (in a BASIC Transformer stage) would do it. Or you could write your own parallel routine. A probably slower option would be an External Filter stage, that could use regular expressions to report a match/non-match result.

Posted: Thu Mar 17, 2011 7:37 pm
by ray.wurlod
Information Analyzer (8.1.2 or later) has a matches_regex data rule available. It would be interesting to inspect the osh script that is generated from a data rule job that uses this, to see which operator is invoked. Alas, I'm not currently in a position to do that.

Posted: Fri Mar 18, 2011 7:42 am
by battaliou
Use the dcount function to establish how many occurances there are of your substring e.g. dcount(''dog cat dog cow dog'',"cat") will return 2

Posted: Fri Mar 18, 2011 8:52 am
by jwiles
That's not quite right---there's only one cat.

DCount counts the number of delimited fields in the first argument based on the delimiter specified in the second argument. There are two fields defined by the delimiter "cat".

I believe the easiest options to implement are the two Index() example given earlier or the Matches function in a BASIC transformer.

Regards,

Posted: Fri Mar 18, 2011 9:00 am
by DSguru2B
You can always do it in unix without much stretching. Yes? no? maybe so?

Posted: Fri Mar 18, 2011 9:15 am
by chulett
Oooo ooo! [thrusts hand into the air] Pick me! Pick me!

awk?

Posted: Fri Mar 18, 2011 9:25 am
by jwiles
Somehow, I almost always overlook awk when thinking of a parallel job ;)

Posted: Fri Mar 18, 2011 12:14 pm
by DSguru2B
AWK yes, i was thinking more like a simple grep :)

Posted: Fri Mar 18, 2011 12:50 pm
by chulett
D'oh! [hangs head in shame]

Posted: Fri Mar 18, 2011 1:56 pm
by swapnilverma
thx guys ...

i have used index function multiple times for now ...

I just came to that this requirement is for "quick fix " and not for the long run ...

thanks again!