Extract data from a Database Field

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Extract data from a Database Field

Post by Dsnew »

Hi all,

I have a requirement to extract information from a database field which can have sample values as below

Code: Select all

Austin+Texas;Blue+Sky;air-baloon+ride;Laptop+Pad;Grass+Green;Deep+;
air-baloon+ride;Austin+Texas;Blue+Sky;Laptop+Pad;Grass+Green;Deep+;
Table+Chair;Blue+Sky;air-baloon+ride;Laptop+Pad;Grass+Green;Deep+;Austin+Texas;
Blue+Sky;air-baloon+ride;Austin+Texas;Laptop+Pad;Grass+Green;Deep+;
Laptop+Pad;Grass+Green;Deep+;Austin+Texas;Blue+Sky;air-baloon+ride;Cool+Cat;
Basically the logic is that the related data (Paired) is seperated by semi colon
The pair is seperated by a "+" sign
Sometimes the second part of the so called Paired data could be missing.
The order of the pairs is not fixed.


I was envisioning either a Oracle function or Datastage Routine, which can search in the string for the first part and give me the second part of the pair (First part could be passed as an input parameter)
Please guide
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Consider a server job with a server routine. DataStage BASIC has a wonderful richness of string-handling functions, because everything it does it does on strings.

Code: Select all

FUNCTION ReturnSecondOfPair(aString, aSearchString)

* Convert string into dynamic array.
String = Convert(";", @FM, aString)

* Find search string within string
Find aSearchString In String Setting FMC,VMC,SMC
Then
   Ans = Field(String<FMC>, "+", 2, 1)
End
Else
   Ans = ""
End

RETURN(Ans)
Error handling (such as null argument handling) has been omitted for clarity.
Last edited by ray.wurlod on Thu Mar 04, 2010 2:35 am, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

Ray,

Thank you for the quick response.

Code: Select all

TEST #1
*******
 
Arg1 = Austin+Texas;Blue+Sky;air-baloon+ride;Laptop+Pad;Grass+Green;Deep+; 
Arg2 = Blue
 
Test completed.
 

Result = 
The expected value is Sky, but I get a blank result. Appreciate your valuble inputs on this.
vkhandel
Participant
Posts: 35
Joined: Wed Oct 04, 2006 12:12 am
Location: Pune

Post by vkhandel »

The logic could be implemented in a transformer stage using stage variables as explained below -

Suppose the input column name is "A" storing the entire string, i.e., "Austin+Texas;Blue+Sky;air-baloon+ride;Laptop+Pad;Grass+Green;Deep+; "

Stage variable: SVarIndexString
Derivation: Index(A,'Laptop',1)
Here "Laptop" is the Arg2 (passed as parameter)

Stage variable: SVarNewString
Derivation: A[SVarIndexString,Len(A) - SVarIndexString]

Stage variable: SVarIndexPairSeparator
Derivation: Index(SVarNewString,'+',1)

Stage variable: SVarIndexValueSeparator
Derivation: Index(SVarNewString,';',1)

And finally in the Derivation Tab, we can fetch the "other part" of the string by using the derivation -
OtherPart = SVarNewString[SVarIndexPairSeparator+1,SVarIndexValueSeparator - SVarIndexPairSeparator]

The output would be "Pad".
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Dsnew wrote:Ray,

Thank you for the quick response.

Code: Select all

TEST #1
*******
 
Arg1 = Austin+Texas;Blue+Sky;air-baloon+ride;Laptop+Pad;Grass+Green;Deep+; 
Arg2 = Blue
 
Test completed.
 

Result = 
The expected value is Sky, but I get a blank result. Appreciate your valuble inputs on this.
Please confirm that you used 2 as the third argument of the Field() function.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

My error. The Find statement should be FindStr.

I just tested it and it works OK.

Obviously there will be some cases where it won't work, where the search string occurs earlier in the overall string than where it's the first element. But none of your examples had that situation.

The solution provided by vkhandel suffers from the identical limitation.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

Ray,
I used your code (latest). I get similar results :(

Code: Select all

TEST #1
*******
 
Arg1 = Austin+Texas;Blue+Sky;air-baloon+ride;Laptop+Pad;Grass+Green;Deep+;
Arg2 = Blue
 
Test completed.
 

Result = 
vkhandel, Thanks for you code, I have not yet implemented it, will let you know how it goes.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

My test results are as follows.
aString: "Austin+Texas;Blue+Sky;air-baloon+ride;Laptop+Pad;Grass+Green;Deep+;"
aSearchString: "Blue"
Ans: "Sky"

Here is the routine:

Code: Select all

FUNCTION ReturnSecondOfPair(aString, aSearchString) 

* Convert string into dynamic array. 
String = Convert(";", @FM, aString) 

* Find search string within string 
FindStr aSearchString In String Setting FMC,VMC,SMC 
Then 
   Ans = Field(String<FMC>, "+", 2, 1) 
End 
Else 
   Ans = "" 
End 

RETURN(Ans)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

It works fine now! Not sure what went wrong with me last time.
Thanks you so much Ray!!!
It worked for all scanario for me, can you please let me know the limitation you said this had, so I can be aware of it.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Dsnew wrote:It worked for all scanario for me, can you please let me know the limitation you said this had, so I can be aware of it.
What he means is if you have your search string as a sub-string elsewhere in your name/value list and that substring appears prior to that actual search string e.g.

Stair+1;air+2;

With search string of "air" would get you the 1 and not the 2.

You can for the most cut this off by searching for "<name>+" (this will not work for above example) or almost entirely by searching for ";<name>+" but will need separate handling for the first name/value pair
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not really separate handling needed, just stick a ';' on the front of the string before searching.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Umm... not so, because the ";" characters have become field marks, and FindStr and related statements don't include these as data values. However, a trailing "+" would do the trick.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ah... missed that little nuance. Still, bracketing your search string with the internal delimiters can be a handy little trick for stopping substring matches. Generally. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

Kryt0n wrote:What he means is if you have your search string as a sub-string elsewhere in your name/value list and that substring appears prior to that actual search string e.g.

Stair+1;air+2;

With search string of "air" would get you the 1 and not the 2.

You can for the most cut this off by searching for "<name>+" (this will not work for above example) or almost entirely by searching for ";<name>+" but will need separate handling for the first name/value pair
ray.wurlod wrote:However, a trailing "+" would do the trick. ...
How do we handle the above screnario?
Like already mentioned above adding a trailing "+" gives 1 and not 2.
Any help appreciated.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Here is the routine:

Code: Select all

FUNCTION ReturnSecondOfPair(aString, aSearchString) 

* Convert string into dynamic array. 
String = Convert(";", @FM, aString) 
SearchString = aSearchString : "+"

* Find search string within string 
FindStr SearchString In String Setting FMC,VMC,SMC 
Then 
   Ans = Field(String<FMC>, "+", 2, 1) 
End 
Else 
   Ans = "" 
End 

RETURN(Ans) 
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply