Count number of fields in ~~ delimited string

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
JeroenDmt
Premium Member
Premium Member
Posts: 107
Joined: Wed Oct 26, 2005 7:36 am

Count number of fields in ~~ delimited string

Post by JeroenDmt »

I want to count the number of fields in a ~~ delimited string.
For example if the string is "field 1~~field2~~field3' the result should be 3.

I thought I could do this using the Count function (and add 1) or the DCount function.

Though when the string is 'field1~~~~field3', I'd like the result to be 3 as well, but in this case Count(string,'~~')=3 and DCount(string,'~~')=4, because there is 3 ways it can find a ~~ in the string.

Is there a way I can get the right count of the ~~ delimiters in Datastage?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Count single occurances, divide by 2 and then add 1.
-craig

"You can never have too many knives" -- Logan Nine Fingers
JeroenDmt
Premium Member
Premium Member
Posts: 107
Joined: Wed Oct 26, 2005 7:36 am

Post by JeroenDmt »

chulett wrote:Count single occurances, divide by 2 and then add 1.
Smart.

Is there also a way to count the 'real' ~~ delimiters in case a ~ could occur in a string?
So that when the string is 'field~1~~field~2~~field~3' the result is still 3?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Would that be valid in your world? Not sure off the top of my head how to avoid the problem you've seen when checking for multiple-character delimiters. :? Pondering...
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

In the filter command put in the following awk command.

Code: Select all

awk -F"\~~" '{print NF}'

This should take care of it.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
JeroenDmt
Premium Member
Premium Member
Posts: 107
Joined: Wed Oct 26, 2005 7:36 am

Post by JeroenDmt »

DSguru2B wrote:In the filter command put in the following awk command.

Code: Select all

awk -F"\~~" '{print NF}'

This should take care of it.
Thanks a lot. That does take care of it indeed.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Still curious how one would do this without leveraging 'outside' components...
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Well, it can be done using EREPLACE() and count together. Use EREPELACE() to change '~~' to something that is most likely not going to show up in the data. Something like ! or & and do a count on that. In px, pxEreplace() can be used to replace the double tilda.
NOTE: pxEreplace() is not an IBM product. Its user written and can be found in FAQ Discussion forum.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Post Reply