Parsing Names

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
SUMAN CHILUVURI
Premium Member
Premium Member
Posts: 19
Joined: Fri Oct 15, 2010 1:26 pm

Parsing Names

Post by SUMAN CHILUVURI »

I need to split this name into 3 fields even if there is no middle name

Names
Broyles,Keith T
Hammond,Edward J
Bullock,Monica
Poston Jr.,Hugh O

Result

Code: Select all

firstname   lastname  middlename
Broyles     Keith     T
Bullock     Monica
I am facing difficulty if the name does not have a middle name
can any one please give me the syntax.

datastage or sql query any thing is ok.

Moderator: subject changed
suman Datastage Developer
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Please choose more meaningful subjects for your posts, one that is actually related to the question being asked.

As to your issue, spell out in words - regardless of the tool used - the process of parsing the name and rules for determining if a middle name is present or not. Once you do, how you would attempt to do this in DataStage should be much clearer.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Does the comma always appear?

How will you handle:
WILLIAMS, Mary-Jane Fiona
WILLIAMS, Mary Jane Fiona
WILLIAMS Mary-Jane
DE WORDE, William
DE WORDE William
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ray.wurlod wrote:Does the comma always appear?
I certainly hope so. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's important to know. Otherwise, what is the family name of DENNIS JAMES or JAMES DENNIS ?

Is there a business rule that explains how to handle names in which the comma is not present?

What if there are two commas? GATES, III, WILLIAM F

What if there are three commas? SCHULTZ, JR, Charles, Admiral

BUSINESS RULES!!!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
SUMAN CHILUVURI
Premium Member
Premium Member
Posts: 19
Joined: Fri Oct 15, 2010 1:26 pm

Post by SUMAN CHILUVURI »

Comma always appears after the last name.
suman Datastage Developer
FranklinE
Premium Member
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

This is a delimiter problem, and the root cause is the rules (or lack of them) at the data source.

If the main record delimiter is a comma, parse into last-name and first-middle. If there will always be a single space between the first name and middle name (full or initial), reparse first-middle into first-name and middle-name with space as the delimiter.

You main problem is that you cannot code for humans who are allowed to change their minds about how they enter data. With valid variations in names like compound names with a space in between, you see the problem. It can be solved only by imposing requirements at the data entry point and refusing to accept exceptions.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

... or using something like QualityStage to figure it out based on known first names. But even that's not 100%. Can't be, because parents go on inventing new names for their children.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply