Index Position of Last Occurrence of String

jonesv · Post by **jonesv** » Fri Jul 13, 2012 10:20 am

Hi

I am looking for a way to return the index position of the last occurrence of a substring in a given string. The requirement I have is to confirm that an email address has at least one period (.) after the @ character in the address. But there could be multiple dots before the @ and after the @. For example, these are all valid addresses:

J.o.h.n.D.o.e@gmail.com
JohnDoe@name.co.uk

etc.

Any advice would be appreciated.

Thanks

ankursaxena.2003 · Post by **ankursaxena.2003** » Fri Jul 13, 2012 10:30 am

Try to get the string after @ in a Stage Variable and then try to find period.
For example,

SV1 ---> @name.co.uk
SV2 ---> Try to find period in SV1 using Index using Index(SV1,'.',1)

If SV2 has any value other than 0, it means that it has period in it. And if SV2 has value 0, it means that there is no period.

jonesv · Post by **jonesv** » Fri Jul 13, 2012 10:31 am

Now why didn't I think of that :D

Makes perfect sense. Thanks!

ankursaxena.2003 · Post by **ankursaxena.2003** » Fri Jul 13, 2012 10:34 am

Sometimes, the problem is easy but we think too hard and make it complex....
:D

jonesv · Post by **jonesv** » Fri Jul 13, 2012 10:49 am

Very true

chulett · Post by **chulett** » Fri Jul 13, 2012 11:42 am

ankursaxena.2003 wrote:Try to get the string after @ in a Stage Variable and then try to find period.

Just curious what you (jonesv) are using for this - field, I assume.

Also curious how far you need to take the checking - what if is doesn't have an "@" in it? Or has more than one? Doing these kind of validations "manually" can take on a life of their own.

jonesv · Post by **jonesv** » Fri Jul 13, 2012 12:38 pm

Just curious what you (jonesv) are using for this - field, I assume.

I don't understand your question. To extract everything after the "@" (after already validating that it does have an "@") I have this derivation is the stage var:

Code: Select all

Right(Trim(lnk_TestDataRead.EMAIL_ID),Len(Trim(lnk_TestDataRead.EMAIL_ID))-Index(Trim(lnk_TestDataRead.EMAIL_ID),"@",1)+1)

Also curious how far you need to take the checking - what if is doesn't have an "@" in it? Or has more than one? Doing these kind of validations "manually" can take on a life of their own.

Understood. But I am only a lowly developer. I am given a spec, and I code to that spec. But I hear ya, and I have explained to the powers that be that there is no way you could code EVERY validation for a valid email address. But they still want these two checks:

1) each email addy has to have an "@"
2) there has to be at least one period after the "@".

C'est la vie.

chulett · Post by **chulett** » Fri Jul 13, 2012 1:20 pm

Sorry, by "field" I meant the Field() function which makes pulling sections out of a "delimited string" quite simple. So, something like:

Code: Select all

Field(lnk_TestDataRead.EMAIL_ID,"@",2,99)

Would get every "field" starting with the second one in the column, considering the "@" as the field delimiter. You wouldn't have the actual "@" in the result but your subsequent tests would still be valid.

And as you noted, my comment on the "life of its own" was from being in the same situation a number of years ago - "validate the email addresses". Umm yah, sure... at a very high level maybe.

jonesv · Post by **jonesv** » Fri Jul 13, 2012 1:38 pm

That makes my derivation much simpler. Thanks!

ray.wurlod · Post by **ray.wurlod** » Fri Jul 13, 2012 4:27 pm

Make the assumption that it's a valid email address, and therefore contains only one "@". If you want to check this, search for the second occurrence of "@".

Code: Select all

If Index(InLink.EmailAddr, "@", 2) Then "invalid address" Else Index(Field(InLink.EmailAddr,"@",2,1),".",1)

Kryt0n · Post by **Kryt0n** » Sun Jul 15, 2012 4:46 pm

Or that DCount(field, '@') = 1...

ray.wurlod · Post by **ray.wurlod** » Mon Jul 16, 2012 3:24 am

Kryt0n wrote:Or that DCount(field, '@') = 1...

No, sorry, that won't give the required answer - it will only be true if there are NO @ characters in the string.

Further, it's less efficient to use DCount() than Index() because DCount() must process the entire string whereas Index() can stop as soon as its condition is met.

jonesv · Post by **jonesv** » Mon Jul 16, 2012 11:19 am

Understood. Thanks everyone!

Kryt0n · Post by **Kryt0n** » Mon Jul 16, 2012 4:50 pm

I actually meant count rather than dcount (had dcount on my mind due to other thoughts). As for efficiency, I see minimal difference unless the odds on having 2 or more @ symbols in the string is high, since you are always searching for 2 in your index query, it is likely to scan the full string when the majority of records have a valid address.

On top of that an index for the second occurrence with NO @ symbol will attempt to run the Index/Field functions decreasing efficiency