string splitting

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

string splitting

Post by suresh_dsx »

Hi,
I have a column name is Full name
i want to split the full name into three columns
First name,middlename,last name.

Here is the data,
john bbb colian
sam donald

1. first row is contains the first name ,middlename,last name.
2.second row is first name,lastname

my source data is sequential file.


i tried with substring. it is not reading the value.
please give some suggestions
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

There are several ways, but what is the logic on the second record?
Either "donald " goes to middle name or last name??
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

Post by suresh_dsx »

john bbb colian
sam donald


only thing is we need to read the data. we already validating the data.

sam is first name
donald is last name


where ever we have the middle name...we need to dispay the name
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

In transformer,

Code: Select all

Stage variable : vCount = Dcount(Input.Field,' ') 
Use the following derivation for each column.

Code: Select all

Firstname : Field(Input.field,' ',1) 
MiddleName : If vCount > 1 Then Field(Input.field,' ',2)  Else ''
LastName : If vCount > 1 Then Field(Input.field,' ',3)  Else Field(Input.field,' ',2)
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

On a point of efficiency, you've calculated Field(Input.Field, " ", 2) twice. It would therefore be better to calculate this in a stage variable. Once.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Yup... it will be certainly efficient.
I had a different approach when started, but ended with this.

Suresh, try using this

Stage variable :

Code: Select all

vCount = Dcount(Input.Field,' ')
vField2 =  Field(Input.field,' ',2)
Use the following derivation for each column.

Code: Select all

Firstname : Field(Input.field,' ',1)
MiddleName : If vCount > 1 Then vField2 Else ''
LastName : If vCount > 1 Then Field(Input.field,' ',3)  Else vField2
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
travissolt
Participant
Posts: 51
Joined: Mon Feb 05, 2007 1:53 pm

Post by travissolt »

What if the data looked like the below?


colian,john bbb
donald,sam

1. first row contains the last name,first name' 'middle_initial
2.second row is lastname,firstname' 'middle_initial


I can grab the last name easy enough but is there away to tell it to grab whats to the right of the first comma it sees for the first name? I don't see away around having the firstname and middle_initial winding up in the firstname field but can that work.

Thanks
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The Field function will do that for you automatically. It will only see two comma delimited values, so the space won't bother it. You could then use Field again on the 'first middle' value with a space as the delimiter to split it again... if only it was as simple as that. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Stage Var:

Code: Select all

sVlast = Field(Input.field,',',1)
sVfirst = Field(Input.field,',',2)
sVcount = Dcount(sVfirst,' ')
Column Derivations:

Code: Select all

Firstname : If sVcount = 1 then Field(sVfirst,' ',1) else sVfirst
MiddleName : If sVcount=1 then Field(sVfirst,' ',2) Else '' 
LastName : sVlast
IHTH
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
travissolt
Participant
Posts: 51
Joined: Mon Feb 05, 2007 1:53 pm

Post by travissolt »

I actually understand that and wish I could have figured it out. Thanks for the help.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

But as I said, it's never as simple as that. :(

Smith, Bobby Jo

Do you really want 'Jo' as the middle name or initial?
-craig

"You can never have too many knives" -- Logan Nine Fingers
travissolt
Participant
Posts: 51
Joined: Mon Feb 05, 2007 1:53 pm

Post by travissolt »

Luckily these are unedited fields so the best we can do is get a lastname and a firstname. I am surprised by some of the things they have in there :)
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

"Things" Like what :?
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You probably don't want to know. :wink:

If it's like most systems I've seen, it could be - literally - anything.
-craig

"You can never have too many knives" -- Logan Nine Fingers
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

"Anything" makes more sense 8) . Now i really don't want to know anything.
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
Post Reply