Concatenate multirow data into one line

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
satyaatibm
Participant
Posts: 7
Joined: Wed Feb 06, 2013 11:31 am

Concatenate multirow data into one line

Post by satyaatibm »

Hi,

input file:
CLASS, STUDENTS
--------------------------
1,Ram
1,Hari
1,Raj
2,John
2,terry
2,Smith
2,Peter

O/p should be:
CLASS, STUDENTS
--------------------------
1,RamHariRaj
2,JohnterrySmithPeter

I wanted to concatenate all the rows based on the group column which is class here.

Please suggest the possible solution here.

Thanks in Advance.
prasson_ibm
Premium Member
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

Where is the interview? :wink:
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What have you tried? What version are you running?
-craig

"You can never have too many knives" -- Logan Nine Fingers
satyaatibm
Participant
Posts: 7
Joined: Wed Feb 06, 2013 11:31 am

Post by satyaatibm »

I tried the below method in 9.1..

SRC-->SORT-->TFM-->Remove Duplicate--> TGT

Sort by class and in TFM, I am taking 3 stage variables as below order
KeyNext=class
concatText=
If keyNext <> keyPre Then STUDENTS else concatText: STUDENTS

keyPre=KeyNext

Here I am getting partial o/p as the first row data (Ram) is not going to the target.
Hope this will help..
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If you are concatenating in a transformer, then the Remove Duplicates should be set to keep the last duplicate per class - is that the case?
-craig

"You can never have too many knives" -- Logan Nine Fingers
satyaatibm
Participant
Posts: 7
Joined: Wed Feb 06, 2013 11:31 am

Post by satyaatibm »

Yes, I am keeping the last row in the Remove Duplicates stage...

But the o/p i am getting is:
1, HariRaj (Here Ram is missing)
2,JohnterrySmithPeter
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If you are running on multiple nodes, how are you paritioning? And if so, does it work properly if you run on a single node? You could also cut this back to one stage variable if you let the Sort stage add a Key Change column. You could also leverage Transformer Looping. FYI.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ssnegi
Participant
Posts: 138
Joined: Thu Nov 15, 2007 4:17 am
Location: Sydney, Australia

Reply

Post by ssnegi »

Add transformer with following partition.
class partition hash sort ascending, student no partition sort only ascending.
transformer stage variables :
student : if input.class=classold then studentold:input.student else input.student
classold : input.class
studentold : student
Then add remove duplicate stage with key = class and duplicate to retain last
Partition : same
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Or just add a KeyChange column over Class in the Sort stage. Then just one stage variable will do:

Code: Select all

svStudents: If KeyChange then link.student else svStudents : link.student
Then just make sure your partitioning is correct or run on a single node. And keep the last duplicate. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply