Routine to validate US State Codes

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Routine to validate US State Codes

Post by bapajju »

Hello,

In a routine I want to validate if the code is one of the valid US State codes . If yes then return 1 else return 0. I have used the below mentioned routine code to get the result. But I do not think this is an efficient and compact code. Can some one help on this.


Ans =' '
If Arg1 = 'AR' or Arg1 = 'AZ' or Arg1 = 'CA' or Arg1 = 'CO' or Arg1 = 'CT'or Arg1 = 'FL' or Arg1 = 'GA' or Arg1 = 'IA' or Arg1 = 'IL' or Arg1 = 'IN'or Arg1 = 'KS' or Arg1 = 'LA' or Arg1 = 'MA' or Arg1 = 'MI' or Arg1 = 'MN' or Arg1 = 'MO' or Arg1 = 'MS' or Arg1 = 'NE' or Arg1 = 'NH' or Arg1 = 'NJ' or Arg1 = 'NM' or Arg1 = 'NV' or Arg1 = 'OH' or Arg1 = 'OK' or Arg1 = 'OR' or Arg1 = 'PA' or Arg1 = 'PR' or Arg1 = 'IN' or Arg1 = 'TN' or Arg1 = 'TX' or Arg1 = 'UT' or Arg1 = 'VA' or Arg1 = 'WA' or Arg1 = 'WI' or Arg1 = 'WV'or Arg1 = 'DF' or Arg1 = 'MS' or Arg1 = 'NE' or Arg1 = 'NH' or Arg1 = 'NJ' or Arg1 = 'NM' or Arg1 = 'NV' or Arg1 = 'OH' or Arg1 = 'OK' or Arg1 = 'OR'
Then
Ans=1
End
Else
Ans=0
End
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Load them into a hashed file.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Put all of your state codes into a comma separated string, then just use the INDEX BASIC function if you can guarantee that Arg1 will always be 2 characters. For added performance, initialize States variable once and use a labeled COMMON to hold the variable. INDEX with a 1 in the third argument means stop on the first occurrence and return the character position where found. If not found, it returns 0.

Code: Select all

States = "AA,BB,CC,DD,EE,FF"

FOUND=INDEX(States, Arg1,1)
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Or that. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Post by bapajju »

Thanks Kenneth. it works. But lets say rather than getting the Character Number of the first occurance I need only 0 and 1. 1 for Found and 0 for not found. Is there any shortcut way to this.
kcbland wrote:Put all of your state codes into a comma separated string, then just use the INDEX BASIC function if you can guarantee that Arg1 will always be 2 characters. For added performance, initialize States variable once and use a labeled COMMON to hold the variable. INDEX with a 1 in the third argument means stop on the first occurrence and return the character position where found. If not found, it returns 0.

Code: Select all

States = "AA,BB,CC,DD,EE,FF"

FOUND=INDEX(States, Arg1,1)
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You've got what you need - a zero is not found and a non-zero is found.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sun rays
Charter Member
Charter Member
Posts: 57
Joined: Wed Jun 08, 2005 3:35 pm
Location: Denver, CO

Post by sun rays »

Use the hash file to look-up, and in the transformer check if the look-up failed or suceeded, by using ISNULL() and assign a zero or 1 based on the result.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

By using COMMON to hold the comma separated list, the list is built once. After that, for short text strings, say under 10K length, the INDEX command is the fastest method. By guaranteeing 2 char values, you don't worry about false hits on a substring match. If the list was large, or the chance of a substring match, then the hash lookup would be a safer and better solution.

As for help on the INDEX command, look in your DS BASIC manual on the start button to learn about it. Craig's correct, non-zero values will be treated as @TRUE if used in a boolean expression.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

In this case all the States are 2 digits and don't contain substrings of each other so the quick INDEX() will work well. If you were looking up a country list which might have ...USA,UK,UKMENISTAN... {doesn't exist, but I couldn't think of a better example} then it would make sense to change the comma delimiter to @FM and use the FIND() function or the LOCATE() function.
bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Post by bapajju »

Thanks everyone. It works perfect with INDEX(). Thanks a lot for the prompt help.
ArndW wrote:In this case all the States are 2 digits and don't contain substrings of each other so the quick INDEX() will work well. If you were looking up a country list which might have ...USA,UK,UKMENISTAN... {doesn't exist, but I couldn't think of a better example} then it would make sense to change the comma delimiter to @FM and use the FIND() function or the LOCATE() function.
Post Reply