Page 1 of 1

Concatenate multirow data into one line

Posted: Fri Apr 11, 2014 4:57 am
by satyaatibm
Hi,

input file:
CLASS, STUDENTS
--------------------------
1,Ram
1,Hari
1,Raj
2,John
2,terry
2,Smith
2,Peter

O/p should be:
CLASS, STUDENTS
--------------------------
1,RamHariRaj
2,JohnterrySmithPeter

I wanted to concatenate all the rows based on the group column which is class here.

Please suggest the possible solution here.

Thanks in Advance.

Posted: Fri Apr 11, 2014 5:49 am
by prasson_ibm
Where is the interview? :wink:

Posted: Fri Apr 11, 2014 7:21 am
by chulett
What have you tried? What version are you running?

Posted: Fri Apr 11, 2014 9:36 am
by satyaatibm
I tried the below method in 9.1..

SRC-->SORT-->TFM-->Remove Duplicate--> TGT

Sort by class and in TFM, I am taking 3 stage variables as below order
KeyNext=class
concatText=
If keyNext <> keyPre Then STUDENTS else concatText: STUDENTS

keyPre=KeyNext

Here I am getting partial o/p as the first row data (Ram) is not going to the target.
Hope this will help..

Posted: Fri Apr 11, 2014 9:52 am
by chulett
If you are concatenating in a transformer, then the Remove Duplicates should be set to keep the last duplicate per class - is that the case?

Posted: Fri Apr 11, 2014 10:25 am
by satyaatibm
Yes, I am keeping the last row in the Remove Duplicates stage...

But the o/p i am getting is:
1, HariRaj (Here Ram is missing)
2,JohnterrySmithPeter

Posted: Fri Apr 11, 2014 12:00 pm
by chulett
If you are running on multiple nodes, how are you paritioning? And if so, does it work properly if you run on a single node? You could also cut this back to one stage variable if you let the Sort stage add a Key Change column. You could also leverage Transformer Looping. FYI.

Reply

Posted: Fri Apr 11, 2014 4:08 pm
by ssnegi
Add transformer with following partition.
class partition hash sort ascending, student no partition sort only ascending.
transformer stage variables :
student : if input.class=classold then studentold:input.student else input.student
classold : input.class
studentold : student
Then add remove duplicate stage with key = class and duplicate to retain last
Partition : same

Posted: Fri Apr 11, 2014 10:00 pm
by chulett
Or just add a KeyChange column over Class in the Sort stage. Then just one stage variable will do:

Code: Select all

svStudents: If KeyChange then link.student else svStudents : link.student
Then just make sure your partitioning is correct or run on a single node. And keep the last duplicate. :wink: