
Full Name

Posted: Sun Oct 08, 2006 12:37 pm
by edward_m
ALL,
From my source I am getting a full name (this includes first name, last name, suffix, prefix and title). My requirement is to split the full name and pass it to the target as first name, last name, suffix, prefix and title.
For example, if the full name is JOHN E EDWARD Jr., then the target should be:
First Name: JOHN
Last Name: EDWARD
Suffix: E
Prefix: (space)
Title: Jr.

Please suggest how to achieve the above using DS functions.

THANKS IN ADVANCE.

Posted: Sun Oct 08, 2006 1:25 pm
by ArndW
There are large (and pricey) software applications for this type of name matching and cleansing, and they are usually more accurate than home-grown solutions.

First you need to get your logic down, then you can write some DS Basic code or functions to do this for you.

1. Parse out all variations of Mr., Mrs., Miss, Dr., Prof., etc.
2. You will now have a string with 1, 2 or more spaces.
3. If 1 space then word 1 = first name, word 2 = last name
4. If 2 spaces then word 1 = first name, word 2 = middle name/initials, word 3 = last name
5. If more than 2 spaces... what do you do?

If your logic is something like the above, then doing the actual parsing in DS Basic is a matter of a couple of lines. If your incoming string could contain a name in the form "Dr. A. E. Neumann, Jr." or "Neumann, Alfred E.", then you need to rethink your algorithm.
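
To make the numbered steps concrete, here is a rough sketch of that logic. It is written in Python rather than DS Basic, purely as an illustration. The title list and the function name `split_name` are assumptions, only a leading title is stripped, and anything that does not come out as two or three words is left unparsed because step 5 leaves that case open.

```python
# Hypothetical title list; extend it to match the data actually seen.
TITLES = {"MR", "MRS", "MISS", "MS", "DR", "PROF"}

def split_name(full_name):
    """Return (title, first, middle, last), or None if the name does not fit."""
    words = full_name.split()  # splitting on whitespace also collapses repeated spaces
    title = ""
    # Step 1: strip a leading title such as "Mr." or "Dr"
    if words and words[0].rstrip(".").upper() in TITLES:
        title = words.pop(0)
    # Step 3: one space, so two words
    if len(words) == 2:
        return title, words[0], "", words[1]
    # Step 4: two spaces, so three words
    if len(words) == 3:
        return title, words[0], words[1], words[2]
    # Step 5: more words (or fewer) need a rule this sketch does not define
    return None

print(split_name("Dr. Alfred E Neumann"))
print(split_name("Dr. A. E. Neumann, Jr."))
```

The second call returns None, which is the point of the warning above: a trailing "Jr." or a "Last, First" layout breaks a purely positional rule.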

Posted: Sun Oct 08, 2006 1:48 pm
by ray.wurlod
With a consistent format it's an easy task, mainly using the Field() - and possibly Count() - functions.

In real life names come in lots of different formats. The best tool to use is QualityStage name standardization, which "buckets" the various name components (for example title, main name, first name, generation (e.g. Junior), and so on). This can be called from DataStage through the QualityStage plug-in.

Posted: Mon Oct 09, 2006 7:24 am
by edward_m
It's in a consistent format; please share some sample code.


Thanks..

Posted: Mon Oct 09, 2006 11:54 am
by ArndW
Are there always the same number of fields? If so, then FIELD(In.String,' ',1) = fname, FIELD(In.String,' ',2) = middlei, FIELD(In.String,' ',3) = lname, FIELD(In.String,' ',4) = Suffix.
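
For anyone who wants to try the idea outside DataStage, here is a small Python sketch of the same positional split. The helper `field` is a hypothetical, rough stand-in for the DS Field() function (1-based occurrence, non-empty delimiter), not its actual implementation, and the sample name is taken from the first post.

```python
def field(string, delimiter, occurrence):
    """Return the occurrence-th (1-based) delimited substring, or "" if there is none."""
    parts = string.split(delimiter)
    if 1 <= occurrence <= len(parts):
        return parts[occurrence - 1]
    return ""

name = "JOHN E EDWARD Jr."
fname = field(name, " ", 1)    # first name
middlei = field(name, " ", 2)  # middle initial
lname = field(name, " ", 3)    # last name
suffix = field(name, " ", 4)   # suffix

print(fname, middlei, lname, suffix)

# In this sketch a doubled space produces an empty field, so positions shift:
print(repr(field("EDWARD  R CLINTON", " ", 2)))
```

This only works while every name has the same number of words separated by exactly one space.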

Posted: Mon Oct 09, 2006 1:00 pm
by edward_m
"Are there always the same number of fields?"
Yes, but the spacing between words is not always the same; sometimes there is more than one space between the words in FULL NAME.
First I have to format the full name with one space between the words, then I need to pass them to last name, first name and so on.

Any idea how to format a name with only one space between the words?

For example, if FULL NAME is 'EDWARD R CLINTON ' I want to format it to
'EDWARD R CLINTON', then use the Field function to extract first name, last name and so on.

Posted: Mon Oct 09, 2006 1:08 pm
by thumsup9
Use the Trim function with option D:

Removes leading and trailing spaces and tabs, and reduces multiple spaces and tabs to single ones.

Posted: Mon Oct 09, 2006 2:31 pm
by ArndW
OK, then either use a stage variable ShortString = TRIM(In.String) or do as before:

FIELD(Trim(In.String),' ',1) = fname, FIELD(Trim(In.String),' ',2) = middlei, FIELD(Trim(In.String),' ',3) = lname, FIELD(Trim(In.String),' ',4) = Suffix
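
As a quick illustration of why trimming first fixes the positions, here is a Python sketch of the same two steps. The helper name `trim` and the sample string are assumptions for illustration; Python's split() treats any whitespace as a separator, which is slightly broader than the spaces and tabs mentioned above.

```python
def trim(string):
    """Strip leading/trailing whitespace and reduce inner runs to a single space."""
    return " ".join(string.split())

raw = "  EDWARD   R  CLINTON   JR "  # made-up input with uneven spacing
clean = trim(raw)

# With exactly one space between words, positional splitting is reliable again.
# The unpacking below assumes exactly four words and raises ValueError otherwise.
fname, middlei, lname, suffix = clean.split(" ")

print(repr(clean))
print(fname, middlei, lname, suffix)
```

Computing the trimmed string once, as with the stage variable, avoids repeating the trim for every output column.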

Posted: Mon Oct 09, 2006 2:53 pm
by edward_m
It solved my requirement.
Thanks a lot for all replies.