DSXchange

Posted: **Wed Nov 11, 2015 6:51 am**

What would be the most efficient way to achieve the following logic in a Parallel job?

A varchar(100) column may contain zero or more commas anywhere in the string.

For each comma, if the previous character is a digit 0-9 and the following character is a digit 0-9, then remove the comma, otherwise replace the comma with a space.

If the column contains zero commas, pass the unchanged column value through. Examples:

"$321.00 4 5" becomes "$321.00 4 5" (unchanged)
"1,234 miles" becomes "1234 miles"
"Let's eat, grandpa!" becomes "Let's eat  grandpa!"
"Area 5, Sections 6,7,8,,9" becomes "Area 5  Sections 678  9"
",Parts A,B,C," becomes " Parts A B C "
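
To pin the rule down precisely, here is a minimal reference sketch in Python. It is not DataStage code, and `fix_commas` and `DIGITS` are made-up names used only for illustration. It scans the string once and decides each comma from its immediate neighbours in the original value:

```python
DIGITS = "0123456789"  # only ASCII 0-9 count as digits, per the rule above


def fix_commas(value: str) -> str:
    """Drop commas that sit between two digits; turn every other comma into a space."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch != ",":
            out.append(ch)  # non-comma characters pass through unchanged
            continue
        prev_is_digit = i > 0 and value[i - 1] in DIGITS
        next_is_digit = i < last and value[i + 1] in DIGITS
        if prev_is_digit and next_is_digit:
            continue  # digit,digit: remove the comma
        out.append(" ")  # any other comma becomes a space
    return "".join(out)


if __name__ == "__main__":
    for sample in ["1,234 miles", "Area 5, Sections 6,7,8,,9", ",Parts A,B,C,"]:
        print(repr(sample), "becomes", repr(fix_commas(sample)))
```

Because the neighbours are always read from the original value, runs like "6,7,8" and ",,9" come out the same as in the examples.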

Posted: **Wed Nov 11, 2015 8:34 am**

Sorry, just had to LOL at the classic "Let's eat grandpa!". For want of a comma...

Posted: **Tue Nov 17, 2015 8:11 am**

Not sure about most efficient, but I can only think of 2 ways:
1. Using some sort of regular expression at the Unix command line (e.g. sed).
2. If using DS, then some combination of stage and loop variables that would determine the number of commas (DCount), hold the current, previous and/or next delimited field, and test the first or last character of those fields as required, before rebuilding the output string to match your requirements.
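
As a rough illustration of the regular expression idea, here is a sketch in Python rather than sed. The function name `fix_commas_regex` is hypothetical. It uses lookbehind and lookahead, which POSIX sed does not support, so a sed version would need capture groups plus a loop to cope with overlapping cases such as "6,7,8". Treat this as one possible shape, not a tested DataStage solution:

```python
import re

# A comma with an ASCII digit on both sides. The lookarounds do not consume
# the digits, so overlapping cases such as "6,7,8" are all matched.
COMMA_BETWEEN_DIGITS = re.compile(r"(?<=[0-9]),(?=[0-9])")


def fix_commas_regex(value: str) -> str:
    """Pass 1 removes digit-flanked commas; pass 2 turns the rest into spaces."""
    without_numeric_commas = COMMA_BETWEEN_DIGITS.sub("", value)
    return without_numeric_commas.replace(",", " ")


if __name__ == "__main__":
    print(repr(fix_commas_regex("Area 5, Sections 6,7,8,,9")))
```

The two passes do not interfere with each other: the first only deletes commas, and every comma it leaves behind becomes a space no matter what now sits next to it.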

Posted: **Wed Dec 02, 2015 12:13 am**

I would suggest using QualityStage for this, since you can customize the pattern as you wish.

efficient string manipulation question

efficient string manipulation question