split city and state data from one filed into two fields

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ashok
Participant
Posts: 43
Joined: Tue Jun 22, 2004 3:04 pm

split city and state data from one filed into two fields

Post by ashok »

I want to split city and state in to two fields from one field as in below example:

field: ADDRESS

MAYVILLE WI
SPOKANE WA
FREEBURG IL
ST. LOUIS, MO
MILFORD C.T
FREEBURG, I.L


Results should look like this:

fileds: CITY | STATE

MAYVILLE | WI
SPOKANE | WA
FREEBURG | IL
ST.LOUIS | MO
MILDORD | CT
FREEBURG | IL

thank you
Ashok
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I noticed in another thread that you have QualityStage installed. Why not run these fields through standard address parsing through a QualityStage plugin? It should be able to seperate the city from the state and deliver them back in seperate fields.

If you were doing it in DataStage you need to make certain assumptions. Are city and state always seperated by a space? Does the state always appear at the end of the string? Is state always a two character code?
In a routine I would split off the last two characters after the last space using the DCOUNT and FIELD commands. I would then remove any non alpha charcters from this string. I would then pass this to a state code lookup, either by hard coding state codes in a routine or doing a transformer lookup against a hash file. Anything that does not produce a valid state code gets rejected.
The remainder of the string becomes the city. Trim off from the end of the city name any non alpha characters (eg. ST.LOUIS, becomes ST.LOUIS).
dsxdev
Participant
Posts: 92
Joined: Mon Sep 20, 2004 8:37 am

Post by dsxdev »

A simple solution would be use a Column Export Stage which can export/ split a single column into multiple columns.
But the column to be split should have a format( fixed with or delimited).
Happy DataStaging
PilotBaha
Premium Member
Premium Member
Posts: 202
Joined: Mon Jan 12, 2004 8:05 pm

Post by PilotBaha »

One word : QualityStage.

If you, as an alternative, try to "fish" <blank> + state code, what will you do if you have "Hartford CT" and "Hartford C.T" and "HArtford, CT" etc. etc. No easy way out..

Use QS, and sleep better at night :lol: :D
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If you really want to do it in DataStage you can, though I'd fully support your using Quality Stage - after all, "bucketing" address components is one of the things it does supremely well. For those who don't have Quality Stage...

A "final delimited field" function is handy thing to have in one's tool kit. Here's a cut down version (with no error reporting).

Code: Select all

FUNCTION FinalDelimitedField(String, Delimiter)

* Author:  Ray Wurlod

If UnAssigned(String) Or IsNull(String) Or UnAssigned(Delimiter) Or IsNull(Delimiter)
Then
   Ans = @NULL
End
Else
   If Len(Delimiter) = 1
   Then
      FieldCount = DCount(String, Delimiter)
      Ans = Field(String, Delimiter, FieldCount, 1)
   End
   Else
      Ans = @NULL
   End
End

RETURN(Ans)
State could be derived, then, as FinalDelimitedString(Trim(Address)," ")

The remainder of the address (city) could be handled with something similar, to return all but the final delimited field.

Code: Select all

Field(TrimB(Address), " ", 1, Count(TrimB(Address), " "))
The TrimB function is insurance against the Address field containing trailing spaces.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

And somewhere in there, either at the start or at the end, remove the comma delimiters from the city name.

Code: Select all

Address = EREPLACE(Address, ",", "")
ashok
Participant
Posts: 43
Joined: Tue Jun 22, 2004 3:04 pm

Post by ashok »

thank you for help,

Can any one help me how to do this job in Quality Satage, as I tried using USAREA rule to seperate City and Sate it worked only on one record and passing remaining records as it is to result file,
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Ashok

There is a separate forum for Quality Stage and post your questions there for quick reply. Not all knows Quality Stage in this forum to help you.

Thanks
Regards
Siva

Listening to the Learned

"The most precious wealth is the wealth acquired by the ear Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Not all knows Quality Stage in this forum to help you.
Why not?!!!!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply