Extracting a Part from the given String

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

rkdatastage
Participant
Posts: 107
Joined: Wed Sep 29, 2004 10:15 am

Extracting a Part from the given String

Post by rkdatastage »

Hi All

I had strucked up with a specefic Problem and need some guidence to resolve it.

I had a String like "BOLT, W/NUT, P/N P422 AAAA,P.,&" and i had to extract a part of string like "P/N P422".

The Problem is that the Part of the string that has to be extract can exist any where in the given string .

My Actual String : BOLT, W/NUT, P/N P422 AAAA,P

My Output : P/N P422

Need Some Kind of Functions in Datastage that will fetch the part of string from the given string.

Your Kind response is Appreciated.

Thanks
RK
rkdatastage
Participant
Posts: 107
Joined: Wed Sep 29, 2004 10:15 am

Post by rkdatastage »

Can AnyBody Share your ideas how to extract part of a string from the given string
balajisr
Charter Member
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

The Problem is that the Part of the string that has to be extract can exist any where in the given string
You can use Index() function to get the starting position of the substring.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

DataStage has more string functions than you will normally use and doing what you want is quite easy - if you specify the rules. If you are using a "," (comma) to separate component parts of a string then the FIELD() function will let you do that. But you have also stated that your substring can occur in any field. Are you looking for that one particular substring or one that starts with "P/N"?
jhmckeever
Premium Member
Premium Member
Posts: 301
Joined: Thu Jul 14, 2005 10:27 am
Location: Melbourne, Australia
Contact:

Post by jhmckeever »

RK,

If you need to identify/extract more than just this one type of string then a QualityStage standardisation job would allow you to classify and extract tokens from the string regardless of their input format.

In this case, the classification file of your selected ruleset would define identifying tokens like 'P/N', 'BOLT', etc.

JMcK.
<b>John McKeever</b>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

How do you define "P/N P422" is the required the output. Is it a fixed value for all the set of input srtings. If so you just need to find whether "P/N P422" exist or not. In that case, you can DCount(). If you have another table from which you need to pick up the value you can do lookup and use the function. Or you have start position and end position, you can extract.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
rkdatastage
Participant
Posts: 107
Joined: Wed Sep 29, 2004 10:15 am

Post by rkdatastage »

Hi All

Thanks for your Great Response

1. I had to extract "P/N xxxxxxx" value that has been any where in the Given String

2. I had to Split the Given String into two Parts

Like for Example

Description = "BOLT, W/NUT, P/N P422 WISS,J.,&"

From this I had to extract 2 Strings

a) Part = "P/N P422"
b) Restofdata = "BOLT, W/NUT, WISS,J.,&"

Now I hope my requirement is clear

Pls advise how to write a Routine to solve this
Means Steps i had to Follow

Thanks in Advance
RK
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

As mentioned you can use Index(), which will give you the starting position of "P/N", you can get the substring of required string from the main string by Input.Field[Index(Input.field,"P/N"),9].
And for the second one, you can get the substring from the first char to the point where you find "P/N" and concatinate with the string which starts from 9 char after the the existence of "P/N" to the end.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
rkdatastage
Participant
Posts: 107
Joined: Wed Sep 29, 2004 10:15 am

Post by rkdatastage »

Hi All

Thanks for your response

But the problem is i am not sure the length of P/N . It may range from 0 to 20 digits .

As the String Contains the Data as "P/N" as last characters of whole Description then the Length of "P/N" is 0. And this P/N can Lies anywhere in the Given Description.

What are the Possible Ways to Attack this Problem.

Thanks in Advance
RK
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Spend a little time thinking about it. :?

What marks the end of this part number? You'll need to find the position of 'P/N' in the field. Four positions from there the actual part number starts. It ends... where? The next space you encounter? The end of the string? A comma? All of the above?

You're going to need to write a fairly sophisticated string parsing routine to do this correctly 100% of the time. You've got all the tools you need in the various DataStage string functions. They're all noted in the Help. All that's left is for you to put it all together.

And rather than just start coding, write it out in words first. Once you think you've got it all nailed down, then turn it into code.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

My choice would be MatchField().

Code: Select all

"P/N " : MatchField(InLink.Description,"0X'P/N '0X' '0X",3)
The MatchField() function here returns everything between "P/N " (with trailing space) and the next space character. It will return "" if "P/N " does not occur in the string, or if the part number is at the very end (no trailing space). You can get around the latter with a pair of MatchField functions.

Etiquette Note
Pushing back after less than 30 minutes because no-one had replied was, frankly, offensive. This is an all-volunteer site, and some of us are in different time zones to you. And we even think about our responses, which can take time.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Well, a few suggestions in writing the routine.
Read in the entire description and take four positions at a time and check for "P/N ".
If it doesnt match, store it in one variable which will be called your 'Restofdata' and if it matches, store it another variable untill you hit ,
Something like

Code: Select all

Desc=TRIM(Arg1)
RestOfData=''
Part=''
EndForLoop=LEN(Desc)
FOR n = 1 TO EndForLoop
     IF Desc[n,4] = 'P/N '
     THEN
          Part=Field(Desc[n,EndForLoop-n+1],1,2)
          RestOfData=Desc[1,n-1]:Desc[n+Len(Part),EndForLoop-Len(Part)]
     END
NEXT n
I am sure this is not the best logic, but off the top of my head, this is what i could come up with.
'Part' will contain the part you want and 'RestOfData' is the rest of the data. You can concatenate the two together with either a pipe or some other character that will not showup in your Desc data and return that back.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I revisited my routine above and found some minor logical errors in it. I fixed it. You can use it now. Try to test it for different scenarios. I, currently do not have access to Datastage and hence cannot test it myself.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
rkdatastage
Participant
Posts: 107
Joined: Wed Sep 29, 2004 10:15 am

Post by rkdatastage »

Hi All

I had executed the Code But it is not Working

I tried to Change it But didn't Succeeded

Can you Resend the Code

Thanks
RK
ameyvaidya
Charter Member
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Post by ameyvaidya »

Hi rkdatastage .
Try this one.
You may have to modify the If statement to suit your data.

Code: Select all



InString=Arg1

StringLength=len(InString)

* Get the Starting Location for Part Number
PnStart=INDEX(InString,"P/N",1)
* Init End of Part Number for testing.
PnEnd=StringLength

* To test for Endo of the Part Number. Part number begins at the 5th position from the "P" of "P/N"
FOR PnEndTester=(PnStart+5) TO StringLength

    * This is the check to find the end of Part Number. Change this to suit your part number Valid Characters
    IF NUM(InString[PnEndTester,1]) AND InString[PnEndTester,1]<>" " THEN

        * Left in for Debugging
        PRINT  "PNENDTESTERIf ":PnEndTester: ":" :InString[PnEndTester,1]
    END
    ELSE

        * This char is not a valid part number character
        PnEnd=PnEndTester-1

        * Left in for Debugging
        PRINT  "PNENDTESTERElse ":PnEndTester: ":" :InString[PnEndTester,1]: ":" :PnEnd

        * We have the end position we want. no need to continue.
        GOTO ExitFor
    END
NEXT PnEndTester

ExitFor:
* Get the Part number String
PNString=InString[PnStart,PnEnd-PnStart+1]

* Get the Remaining String
RemString=InString[1,PnStart-1]:InString[StringLength-PnEnd]
Ans=PNString:@FM:RemString

Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
Post Reply