efficient string manipulation question

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

efficient string manipulation question

Post by qt_ky »

What would be the most efficient way to achieve the following logic in a Parallel job?

A varchar(100) column may contain zero or more commas anywhere in the string.

For each comma, if the previous character is a digit 0-9 and the following character is a digit 0-9, then remove the comma, otherwise replace the comma with a space.

If the column contains zero commas, pass the unchanged column value through. Examples:

Code: Select all

"$321.00 4 5" becomes "$321.00 4 5" (unchanged)
"1,234 miles" becomes "1234 miles"
"Lets eat, grandpa!" becomes "Let's eat  grandpa!"
"Area 5, Sections 6,7,8,,9" becomes "Area 5  Sections 678  9"
",Parts A,B,C," becomes " Parts A B C "
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sorry, just had to LOL at the classic "Let's eat grandpa!". For want of a comma... :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ShaneMuir
Premium Member
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

Not sure about most efficient, but I can only think of 2 ways:
1. Using some sort of regular expression at the unix command line (eg sed)
2. If using DS, the some combination of stage and loop variables that would determine the number of commas (DCount), hold the current, previous and/or next delimited field, and perform tests on the first or last char of those fields as required, before rebuilding your output stream based on your requirements.
siauchun84
Participant
Posts: 63
Joined: Mon Oct 20, 2008 12:01 am
Location: Malaysia

Post by siauchun84 »

I will suggest you to use the QualityStage to perform that as you can customize the pattern as you wish.
Post Reply