String extract

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
subrat
Premium Member
Premium Member
Posts: 77
Joined: Tue Dec 11, 2007 5:54 am
Location: UK

String extract

Post by subrat »

Hi

Anyone have any thought on this?

I want to retrieve a text from a column on below logic.

Need to retrive the text start at position 2 characters after keyword "ABC=" till 5 byte or untill a space encountered

e.g:

SourceCol
AHHHNSHHBDHHHSSSUSHUSUUSABC=TESTTTHHSHST output should be - STTTH
AHHHNSHHBDHHHSSSUSHUSUUSABC=TESTT THHSHS output should be - STT

Thanks
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

If your data is in a text file then put the following in the filter command and let awk take care of the parsing. Thats the easiest way I can see. Otherwise you will have to write a C routine and loop through each character.

Code: Select all

awk '{print substr($0,index($0,"ABC=")+6,5)}' | awk -F" " '{print $1}'
Actually there is more to the code above, you need to tell me what column it is in the source file and how is it delimited and then I can give you the exact logic. But the meat is what I have posted above
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Code: Select all

svABCpos <--- Index(InLink.TheString, "ABC", 1)

Field(InLink.TheString[ABCpos+2,5], " ", 1, 1)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Ray, you trumped me. A very good solution. I wonder why I did not think of that.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
subrat
Premium Member
Premium Member
Posts: 77
Joined: Tue Dec 11, 2007 5:54 am
Location: UK

Post by subrat »

ray.wurlod wrote:

Code: Select all

svABCpos <--- Index(InLink.TheString, "ABC", 1)

Field(InLink.TheString[ABCpos+2,5], " ", 1, 1)
I think it would return =TEST from first example as index will keep the first position. Is there any function to return last character of substring in stage variable?


Else we need to add number of search characters in each column derivation
Instead of +2 it would be +5
Field(DSLink3.COL[StageVar+5,5], " ", 1, 1)



Thanks
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:!: Esoteric Usage Alert :!:

YourField[1] will return 1 character from the end of the string.


:wink:
Last edited by chulett on Fri Feb 11, 2011 2:31 pm, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Code: Select all

Right(StageVar,1)
will also return the final character. I prefer this approach as it's more "self-documenting" than the esoteric use of the substring operator.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

subrat wrote:I think it would return =TEST from first example as index will keep the first position. Is there any function to return last character of substring in stage variables
There's nothing in the original specification that asserts that "TEST" will always follow "ABC=".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
subrat
Premium Member
Premium Member
Posts: 77
Joined: Tue Dec 11, 2007 5:54 am
Location: UK

Post by subrat »

ray.wurlod wrote:
subrat wrote:I think it would return =TEST from first example as index will keep the first position. Is there any function to return last character of substring in stage variables
There's nothing in the original specification that asserts that "TEST" will always follow "ABC=".
Ray, the first solution you have given would work fine, but in that case in we need to add a count till we get the last position of that text and then when we can add '2' again which was my requirement.

My only question now is that is there any other function which can return me directly index position of last character in substring (ABC=) in this case.

Hope i make sense,

Thanks
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Length()

If your string is 5 bytes in length, what will the position of the last byte be? A little math is all you need at that point.....

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

jwiles wrote:If your string is 5 bytes in length, what will the position of the last byte be?
Ooo Ooo, I know the answer, I know the answer :!:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And just to head Ray off at the pass, no it's not 42.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

But only for extremely high values of 42
- james wiles


All generalizations are false, including this one - Mark Twain.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I suspect that subrat needs the offset in the original string, but that's just a combination of the Index() function already mentioned and Len() function. The results of each are integers so can be added.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply