Splitting Single column value to three columns
Moderators: chulett, rschirm, roy
Splitting Single column value to three columns
Hi I have a requirement like below
I have source data coming as
Column A
BOISE ID 11022
SUN VALLEY ID 12345
I need these single column value in to three different columns with City,State and ZIP columns. I tired with field function but if you see second record even city has space in between. Can you guys please help me and that would be great help.
Thanks in advance for your valuble time and suggestions.
Thanks,
Kumar.
I have source data coming as
Column A
BOISE ID 11022
SUN VALLEY ID 12345
I need these single column value in to three different columns with City,State and ZIP columns. I tired with field function but if you see second record even city has space in between. Can you guys please help me and that would be great help.
Thanks in advance for your valuble time and suggestions.
Thanks,
Kumar.
I don't see how that would help, you still have to declare a delimiter, and space would still get you the issue noted in the first post. Or am I missing something?
I think you'll need to do some intelligent parsing - find the number of fields in the string by counting the delimiters. Pull off the last field as the zip, then the next-to-last as the state. Then everything else woould be the city. Assuming state and zip are always present, of course. If not, then it gets a little more... interesting.
This would be pretty easy in a Server routine.![Wink :wink:](./images/smilies/icon_wink.gif)
I think you'll need to do some intelligent parsing - find the number of fields in the string by counting the delimiters. Pull off the last field as the zip, then the next-to-last as the state. Then everything else woould be the city. Assuming state and zip are always present, of course. If not, then it gets a little more... interesting.
This would be pretty easy in a Server routine.
![Wink :wink:](./images/smilies/icon_wink.gif)
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI
Hopefully the state and Zip code are not two-worded. In that case you can get last two fields state & Zip and when you have two other fields you can derive the first field using length. hence the derivation will be:
every thing with sv is a stage variable. And First 2 must be executed before the third one or if not using the stage variables the derivation of stage variable needs to be substituted where the stage variable is used.
Code: Select all
ZIP Code: svZIP=field(input column,' ',Dcount(Input Column))
State: svState=field(input column,' ',Dcount(Input Column)-1)
City: svCity=input column[1,len(input column)-(len(svZIP)+len(svState)+2)]
Last edited by priyadarshikunal on Thu Jun 18, 2009 10:24 am, edited 1 time in total.
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.![Wink :wink:](./images/smilies/icon_wink.gif)
Genius may have its limitations, but stupidity is not thus handicapped.
![Wink :wink:](./images/smilies/icon_wink.gif)
To begin with - you'll probably need to do the steps below in stage variables in a transformer.
First - you'll need to use Trim on the field to get rid of any leading or trailing spaces and to reduce any interior spaces to one (ie: changes "City_____ST" to "City_ST"). Leading, trailing or duplicate interior spaces will throw you off. Use the trimmed value for all your substrings below.
Then use the Count function to determine how many spaces are in the string. If it is less than two or null then you need to reject the record as not having all entries.
Then use the field function three times, once to get the last space-separated field (zip), once to get the next to last field (city) and the third time to get the city.
To get the city it is a bit tricky because you'll have to do some math and use the optional last argument (number of substrings) of the Field function to get all but the last two "fields":
First - you'll need to use Trim on the field to get rid of any leading or trailing spaces and to reduce any interior spaces to one (ie: changes "City_____ST" to "City_ST"). Leading, trailing or duplicate interior spaces will throw you off. Use the trimmed value for all your substrings below.
Then use the Count function to determine how many spaces are in the string. If it is less than two or null then you need to reject the record as not having all entries.
Then use the field function three times, once to get the last space-separated field (zip), once to get the next to last field (city) and the third time to get the city.
To get the city it is a bit tricky because you'll have to do some math and use the optional last argument (number of substrings) of the Field function to get all but the last two "fields":
Code: Select all
FIELD (string, delimiter, occurrence [ ,num.substr] )
Last edited by asorrell on Thu Jun 18, 2009 10:25 am, edited 1 time in total.
-
- Premium Member
- Posts: 425
- Joined: Sat Nov 19, 2005 9:26 am
- Location: New York City
- Contact:
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI
correct.First - you'll need to use Trim on the field to get rid of any leading or trailing spaces and to reduce any interior spaces to one (ie: changes "City_____ST" to "City_ST"). Leading, trailing or duplicate interior spaces will throw you off. Use the trimmed value for all your substrings below.
Then use the Count function to determine how many spaces are in the string. If it is less than two or null then you need to reject the record as not having all entries.
you can use field function for city also
Code: Select all
svCity=Field(input column,' ',1,Dcount(input column)-2)
I forgot the mention preprocessing steps noted by asorell.
![Idea :idea:](./images/smilies/icon_idea.gif)
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.![Wink :wink:](./images/smilies/icon_wink.gif)
Genius may have its limitations, but stupidity is not thus handicapped.
![Wink :wink:](./images/smilies/icon_wink.gif)
Thanks for all your help but ,
i have used
Field(Trim(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP),' ',Dcount(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP)) this function toget zip but it is showing in RED Then i use Field(Trim(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP),' ',Dcount(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP,' ')) i am not geting any results on the target.
i have used
Field(Trim(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP),' ',Dcount(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP)) this function toget zip but it is showing in RED Then i use Field(Trim(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP),' ',Dcount(DSLink37.DDAM_SEASONAL_CITY_STATE_ZIP,' ')) i am not geting any results on the target.
Last edited by kumar3846 on Thu Jun 18, 2009 3:26 pm, edited 1 time in total.
Thanks a lot guys and thanks for valuble suggestions
It worked same as priyadarshikunal said in his reply here you go
svZIP=Field(Trim(inputcolumn)," ",Dcount(Trim(inputcolumn)," "))
svState=Field(Trim(inputcolumn)," ",Dcount(Trim(inputcolumn)," ")-1)
inputcolumn[1,len(inputcolumn)-(len(svZIP)+len(svState)+2)]
Thanks
Kumar
It worked same as priyadarshikunal said in his reply here you go
svZIP=Field(Trim(inputcolumn)," ",Dcount(Trim(inputcolumn)," "))
svState=Field(Trim(inputcolumn)," ",Dcount(Trim(inputcolumn)," ")-1)
inputcolumn[1,len(inputcolumn)-(len(svZIP)+len(svState)+2)]
Thanks
Kumar
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Ray of course beat me to it and I am no expert and can only point you in the right direction but this is a perfect use of Quality Stage address standardization.
QS will look for certain patterns in your data and then those patterns are used to "parse" the data into fields. It will recognize multi-part or multi worded street and city names and most often handle things correctly. And, if I am not mistaken, will reject records for further manual review if they don't match a standard/recognized pattern.
And be careful with Canadian postal codes (zip code). They can often have a space between 2 sets of 3 chars (e.g. "M4B 1B3")
QS will look for certain patterns in your data and then those patterns are used to "parse" the data into fields. It will recognize multi-part or multi worded street and city names and most often handle things correctly. And, if I am not mistaken, will reject records for further manual review if they don't match a standard/recognized pattern.
And be careful with Canadian postal codes (zip code). They can often have a space between 2 sets of 3 chars (e.g. "M4B 1B3")
Bob
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Kind of hoping that the people who need it buy it (insert usual disclaimers about not getting finder's fee etc. here).
Sure it can be done with generic routines such as RightmostDelimitedSubstring (this one does not come out of the box), but it's tedious identifying all possible variants on the address and area. The creators of QualityStage have already done all that hard work, for lots of different geographies.
Sure it can be done with generic routines such as RightmostDelimitedSubstring (this one does not come out of the box), but it's tedious identifying all possible variants on the address and area. The creators of QualityStage have already done all that hard work, for lots of different geographies.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.