Page 1 of 1

Concatenation

Posted: Tue May 23, 2017 6:15 am
by kvenna
Hi,
I have Multiple records with some having same key ID.I want to concatenate them
eg input:
ID Col
1 a
1 b
1 c
1 d
2 d
2 e
2 g
3 e
3 s
3 f


And the output should be like this.

output
ID Col
1 a,b,c,d
2 d,e,g
3 e,s,f

anyone who know the solution please help me with this.

Thanks in advance.

Posted: Tue May 23, 2017 6:30 am
by chulett
That is a vertical pivot (rows to columns) but with... a twist. Rather than use a pivot stage, you'll need to do it yourself in a transformer. Have you ever worked with stage variables to do "group change detection"? In your example, that would let you know when you hit a new ID value. Easy enough to do it in the transformer or you can leverage a Sort stage (even if it doesn't sort) to add a change key for you automatically, then in the transformer you just look for it rather than derive it.

Then do your concatenation. First record of a group? Put it in your output column. Any other record in the same group gets concatenated to the output column with a comma in front of it. Then only output the last record in each group to your target.

There are environment variables that can help with that or you can use an Aggregator and the Last() aggregate function after grouping on ID.

A quick search turned up this post (one of many) on the details of group change detection.

Posted: Tue May 23, 2017 6:57 am
by Mike
The LastRowInGroup() transform function eliminates the need for an aggregator after the transformer. This transform function requires an upstream sort stage (even if it is set to no sort).

Mike

Posted: Tue May 23, 2017 9:09 am
by chulett
Ah, yes - thanks Mike. Had a bit of a <squirrel!> moment and didn't add that in as intended after my search. :D