Split a delimited column based a value in another column

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Split a delimited column based a value in another column

Post by anu123 »

I want to split values in column CODES and produce output based the numbers/order listed in 3rd column SEQUENCE. Below is an example of input and output I need to produce. Please help to find a solution for this.

Input:

Num|Codes|Sequence
101|a,b,c|2,3
102|m,n,o,p|3,4
103|x,y,z|3


Required Output:

Num|Codes_to_review
101|b||'~'||c
102|o||'~'||p
103|z


Thanks in advance,

Anu
Thank you,
Anu
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Guessing this really doesn't have anything to do with ProfileStage, so moved your post here.

Off the top of my head, seems like the Field() function would work for this, in conjunction with transformer looping.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Doesn't need looping. I'm also assuming that you're intending to show concatenation with your use of "||" (though the DataStage concatenation operator is ":").

Code: Select all

If Index(Sequence, ",", 1) = 0 Then Field(Codes, ",", Sequence, 1) Else Field(Codes, ",", Field(Sequence, ",", 1, 1)) : "~" : Field(Codes, ",", Field(Sequence, ",", 2, 1))
If you need the "||" characters, simply change the middle operand from "~" to "||~||".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Thank you Ray & Craig for quick response.

Ray, you are right, I want to concatenate.
Is there any way to handle varying count in sequence col, I mean, I can have two sequences in row-1 and four sequences in second row?

Or do I need to code for Max no. of Sequence and then trim '~'?

Thanks,



Thank you.
Thank you,
Anu
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I read the specification as you wanting just two (or one).

Obviously a more generic solution will be more complex. If it's to be no
more than four you could probably stick with the If..Then..Else construct.

Otherwise you are probably going to need some kind of looping solution.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Aha! :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

... or a routine.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Of course.
-craig

"You can never have too many knives" -- Logan Nine Fingers
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Ray, looks like Index() take three params...Index (string, substring, instance) and you have four in there..

Can you please explain why you want me to add below one to over all logic?

Index(Sequence, ",", 1) = 0 Then
Thank you,
Anu
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

anu123 wrote:Ray, looks like Index() take three params...Index (string, substring, instance) and you have four in there..
No, he has three.
-craig

"You can never have too many knives" -- Logan Nine Fingers
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Craig,

As per Ray's response....
If Index(Sequence, ",", 1) = 0 Then Field(Codes, ",", Sequence, 1) Else Field(Codes, ",", Field(Sequence, ",", 1, 1)) : "~" : Field(Codes, ",", Field(Sequence, ",", 2, 1))



But as per the manual, Index() takes just three inputs in total as below.

Index (string, substring, instance).

Thanks,
Thank you,
Anu
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Which is what he has... look more closer, you can't just count the commas:

Index(Sequence, ",", 1)

1st parameter: Sequence
2nd parameter: ","
3rd parameter: 1
-craig

"You can never have too many knives" -- Logan Nine Fingers
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Thank you Craig. I am stupid and was thinking it single quotes and checking for empty.......

thanks for the explaining this to me.

Anu
Thank you,
Anu
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Nothing stupid about missing this at all, it can be confusing when you have mixtures of quotes and commas. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Thank you for encouraging me, Craig.
Thank you,
Anu
Post Reply