Parsing of string

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Parsing of string

Post by raghu22 »

Hi I'm new to the data stage, I had this requirement. Input field is coming from DB2

Input fields
AB123JK45
RF45TJ5678
FS245


Now I need to split each field like this

AB123JK45 into AB1,AB2,AB3,JK4,JK5
RF45TJ5678 into RF4,RF5,TJ5,TJ6,TJ7,TJ8
FS245 into FS2,FS4,FS5

Any Help is highly appreciated....


chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

How about you spell out your requirements in words for us rather than making people puzzle it out from your data? Examples are always important to have but they don't always stand well all on their lonesome.

It looks like you need to find alpha character(s) and then repeat them with each single digit number until you hit the next alpha character(s). Lather, rinse, repeat. Any idea on the maximum number of output columns you could have per row? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Post by raghu22 »

Hi craig, Sorry abt it... from next time onwards I will correct it.As of now, the data which i came across got 10 columns, but it is variable data ... it may go more than 10 columns also
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You still need to spell out your requirements for us. I took a guess, how about detailing what it really is?
-craig

"You can never have too many knives" -- Logan Nine Fingers
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Post by raghu22 »

Hi Craig,thanks for the quick reply... I'm getting the input row in this format it contains 2 alphabets followed by numbers between 1 and 7 and again 2 alphabets followed by numbers between 1 and 7.

Now I need to parse this input into several output rows in this format 2 alphabets followed by each number occurrence

suppose this is input table format

ID Code
1 AB1234567DF345
2 FD357DC34
3 DS234


Now this is my requirement

ID Code
1 AB1
1 AB2
1 AB3
1 AB4
1 AB5
1 AB6
1 AB7
1 DF3
1 DF4
1 DF5
2 FD3
2 FD5
2 FD7
2 DC3
2 DC4
3 DS2
3 DS3
3 DS4
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Is it always the case that there are precisely two alphabetic pairs per record?

Would you be prepared to use a BASIC Transformer stage or a server job?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I smell a pivot coming... but you need to answer Ray's questions before we go too much further.
-craig

"You can never have too many knives" -- Logan Nine Fingers
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Post by raghu22 »

Hi ray, Yes it is precisely 2 alphabets followed by the number. Not yet decided which job should I go, Plz let me know how to do it
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Split each record into two records beginning with the two alphabetic characters and preserving the ID. Parse the new records so that they have the form ID, Letters, number, number, number... Funnel the two streams back together sort/merge on ID and letters. Pivot.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Post by raghu22 »

Thanks ray... let me try it
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Post by raghu22 »

Hi ray, can u kindly explain me which stages should i use.... using transformer stage or restructure stages?

Thanks for helping me
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Transformer stage to divide each record into two, each beginning with two letters, onto two output links.
Funnel stage to reinstate a single stream.
Column Import stage to parse the numeric characters into separate fields. (Or that could have been done in the Transformer stage.)
Pivot stage to produce the required rows.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Post by raghu22 »

Thanks alot...I'm really thankful to you. And last question is there any function available in transformer based on the pattern search to split the record
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Code: Select all

svSecondAlpha      Right(Convert("0123456789","",InLink.TheString),2)
svSecondAlphaPos   Index(InLink.TheString, svSecondAlpha, 1)
svFirstPiece       InLink.TheString[1,svSeconAlphaPos - 1]
svSecondPiece      InLink.TheString[svSecondAlphaPos, Len(InLink.TheString) - Len(svFirstPiece)]
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
raghu22
Premium Member
Premium Member
Posts: 15
Joined: Sun Oct 11, 2009 11:33 am

Post by raghu22 »

Thanks alot
Post Reply