Validating email id field
Moderators: chulett, rschirm, roy
Validating email id field
hi...
in my flat file i m getting a email id field..i want to validate this field before inserting it into table..the email id should be in the format
xyz...@xyz...com or xyz...@xyz...co.xy.i need to write some function in the transform stage..so can u plz suggest me what code i need to write and what functions i need to use
thanks in advance
in my flat file i m getting a email id field..i want to validate this field before inserting it into table..the email id should be in the format
xyz...@xyz...com or xyz...@xyz...co.xy.i need to write some function in the transform stage..so can u plz suggest me what code i need to write and what functions i need to use
thanks in advance
Use the Matches function in the transformer. You dont need a routine. Something like
If you get a 1 that means email is in the format xyz@xyz.com else if you get a 0 that means its not in xyz@xyz.com
Code: Select all
If in.Link Matches '...@...com' then @True else @False
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
In continuation to DSguru2B's solution
To handle other than .com extensions, to directly passing the valid Email Id's and specifying that the Email Id is invalid if it does not match your format.
Code: Select all
If in.Link Matches '...@...' then in.Link else 'Invalid EmailId'
Narasimha Kade
Finding answers is simple, all you need to do is come up with the correct questions.
Finding answers is simple, all you need to do is come up with the correct questions.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Solutions given thus far do not verify that xyz at the beginning of the address and to the left of the @ symbol is the same as the xyz immediately to the right of the @ symbol. Is this, in fact, part of the requirement?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sorry but wouldn't it be better if you knew these emails are real? Maybe someone has done this. If you email then you should get a return saying undeliverable. Has anybody figured out how to process this? This is taking this issue to the next level. If you have an email address that is kim@hotmail.com does not mean that it is a real email address. If you email to this address and it fails then you need to take it out of your database.
Maybe Walter does this on this web site. If anybody has done this then jump in and share. I think this is a solution a lot of us could benefit from because we all have email address to scrub. No WAVE or CASS files for this?
Maybe Walter does this on this web site. If anybody has done this then jump in and share. I think this is a solution a lot of us could benefit from because we all have email address to scrub. No WAVE or CASS files for this?
Mamu Kim
Hmm. Interesting. Lets think about the logic and how to verifty the validity. But i am sure this is above the requirement of the OP, but a very interesting task. Dont know if its possible though without actually emailing the person, resulting in a lot of non-sense mail to the mailing addresses everytime a validity check is run. It can make a lot of people mad. But then again, this is my take on it. I might be wrong.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Not sure I'd want to be actually emailing anything to see if it's good, there's got to be a better way. One sec... I put 'validating email addresses' into Google and got 1.4 million hits and all kinds of references to packages to do this. Also some fun stuff like this that may help someone architect a solution.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Dont know if that could be done with DS. But here is what i could come up. Contains validation checks only. Validation checks as follows
Heres the routine.
I would love for you guys to try and break the code. This way it could be enhanced. Not the best thing around, but this is what i could come up with for now.
- -Check for Zero Length/NULL Email Address
-Makes sure Length of email address is greater than 6 as the shortest email address to be found on the net is of length 6.
-Checks for space in email address.
-Check for exactly 1 @ symbol.
-Makes ure @ symbol is not present in the begining or end of email
-Checks for valid extension (.COM.EDU.GOV.NET.BIZ.ORG.TV) and ISO Country Codes.
Heres the routine.
Code: Select all
FUNCTION IsValidEmail(Arg1)
Equate RoutineName To "IsValidEmail"
in.Email=Trim(Arg1)
emailLength=Len(in.Email)
Ans=@FALSE
***Check for Zero Length
If emailLength = 0 OR ISNULL(in.Email) = 1
Then
RETURN(Ans)
END
***email address is shorter than 6 characters is impossible
If emailLength < 6
Then
RETURN(Ans)
End
***Check for exactly 1 @ symbol
symbolCount=COUNT(in.Email, "@")
If (symbolCount = 0 OR symbolCount > 1 )
Then
RETURN(Ans)
End
***Check for space in email address
spacePos = INDEX(in.Email, " ",1)
If (spacePos > 0)
Then
RETURN(Ans)
End
***@ symbol should not be in the begining or end of email
symbolPosition=INDEX(in.Email, "@", 1)
If (symbolPosition =1 OR symbolPosition = emailLength)
Then
RETURN(Ans)
End
***Valid extensions including ISO country codes
mailExt = UPCASE(".":Field(in.Email,".",Count(in.Email,".")+1))
ISO_CountryCodes = ".COM.EDU.GOV.NET.BIZ.ORG.TV"
ISO_CountryCodes := ".AF.AL.DZ.As.AD.AO.AI.AQ.AG.AP.AR.AM.AW.AU.AT.AZ.BS.BH.BD.BB.BY"
ISO_CountryCodes := ".BE.BZ.BJ.BM.BT.BO.BA.BW.BV.BR.IO.BN.BG.BF.MM.BI.KH.CM.CA.CV.KY"
ISO_CountryCodes := ".CF.TD.CL.CN.CX.CC.CO.KM.CG.CD.CK.CR.CI.HR.CU.CY.CZ.DK.DJ.DM.DO"
ISO_CountryCodes := ".TP.EC.EG.SV.GQ.ER.EE.ET.FK.FO.FJ.FI.CS.SU.FR.FX.GF.PF.TF.GA.GM.GE.DE"
ISO_CountryCodes := ".GH.GI.GB.GR.GL.GD.GP.GU.GT.GN.GW.GY.HT.HM.HN.HK.HU.IS.IN.ID.IR.IQ"
ISO_CountryCodes := ".IE.IL.IT.JM.JP.JO.KZ.KE.KI.KW.KG.LA.LV.LB.LS.LR.LY.LI.LT.LU.MO.MK.MG"
ISO_CountryCodes := ".MW.MY.MV.ML.MT.MH.MQ.MR.MU.YT.MX.FM.MD.MC.MN.MS.MA.MZ.NA"
ISO_CountryCodes := ".NR.NP.NL.AN.NT.NC.NZ.NI.NE.NG.NU.NF.KP.MP.NO.OM.PK.PW.PA.PG.PY"
ISO_CountryCodes := ".PE.PH.PN.PL.PT.PR.QA.RE.RO.RU.RW.GS.SH.KN.LC.PM.ST.VC.SM.SA.SN.SC"
ISO_CountryCodes := ".SL.SG.SK.SI.SB.SO.ZA.KR.ES.LK.SD.SR.SJ.SZ.SE.CH.SY.TJ.TW.TZ.TH.TG.TK"
ISO_CountryCodes := ".TO.TT.TN.TR.TM.TC.TV.UG.UA.AE.UK.US.UY.UM.UZ.VU.VA.VE.VN.VG.VI"
ISO_CountryCodes := ".WF.WS.EH.YE.YU.ZR.ZM.ZW"
i = INDEX(ISO_CountryCodes, mailExt, 1)
If (i <= 0)
Then
RETURN(Ans)
End
***Its finally valid
Ans = @TRUE
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Syntactic validation (which can be done) is not the same thing as certification (which, as far as I am aware, can only be done by trial and error, and still requires the destination mail server to be up).
There is nothing in DataStage or QualityStage for performing email address certification.
Does anyone know the email address rules for IPv6?
DSGuru2B, .INFO and .NAME are also valid extensions, and that's without researching further. Your checks for "@" could have been combined, just using the Index() function. Similarly your checks for length could have been combined. Having checked for one of the country extensions such as ".uk" you still need to check that the previous element is a valid extension code. Some countries use ".co" rather than ".com", for example "jolt.co.uk". So you can get "amp.com.au" but "amp.co.nz".
Depending on how you write your new revised function, would it properly handle "billg@microsoft.com.net"? (And, indeed, is this a valid address?) Time to refer to the standards documents!
There is nothing in DataStage or QualityStage for performing email address certification.
Does anyone know the email address rules for IPv6?
DSGuru2B, .INFO and .NAME are also valid extensions, and that's without researching further. Your checks for "@" could have been combined, just using the Index() function. Similarly your checks for length could have been combined. Having checked for one of the country extensions such as ".uk" you still need to check that the previous element is a valid extension code. Some countries use ".co" rather than ".com", for example "jolt.co.uk". So you can get "amp.com.au" but "amp.co.nz".
Depending on how you write your new revised function, would it properly handle "billg@microsoft.com.net"? (And, indeed, is this a valid address?) Time to refer to the standards documents!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ill definately combine the checks. I'll also add the .NAME and .INFO. Tell me this, isnt there any way where we can open a socket via Basic coding. I was thinking of doing something in those lines to check for a valid DSN. Even if i want to something at the OS level, some companies have there ping functionality blocked, i know mine did, so that wont cut it. Google wasnt much help either except for psuedocode.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: