Merging multiple rows into single row
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 60
- Joined: Thu Dec 17, 2009 1:36 am
Merging multiple rows into single row
Hi,'
I have a requirement to merge multiple rows into one row. I tried using stage variables but it seems that logic is going wrong somewhere.
My Input Data
ColA ColB ColC
1 A 100
1 B 200
1 C 300
Desired Output
1,A,100,B,200,C,300
How can I achieve this?
I tried comparing Currkey = PrevKey and appending values to stage variables, but some of the records give desired output and others not.
Can anybody please help.
I have a requirement to merge multiple rows into one row. I tried using stage variables but it seems that logic is going wrong somewhere.
My Input Data
ColA ColB ColC
1 A 100
1 B 200
1 C 300
Desired Output
1,A,100,B,200,C,300
How can I achieve this?
I tried comparing Currkey = PrevKey and appending values to stage variables, but some of the records give desired output and others not.
Can anybody please help.
-
- Premium Member
- Posts: 60
- Joined: Thu Dec 17, 2009 1:36 am
Thanks for the quick reply.
My job design is like this:
seqfile ->sort ->transformer->remove duplictes->transformer->seqfile
This logic works fine
when
colA colB colC
1 A 10
1 B 20
2 C 30
2 D 40
Output comes as 1,A,10,B,20
2,c,30,D,40
But it fails when
colA colB colC
1 A 10
1 B 20
2 C 30
2 D 40
3 E 50
Output is coming as 1,A,10,B,20
2,C,30,D,40
3,A,C,10,B,20,E,50.
I am doing hash partitioning based on colA.Inside the stage variable if CurrKey = PrevKey then concatenate input columns into the stage variable.
Please let me know where am I going wrong.
Regards,
Gayathri
My job design is like this:
seqfile ->sort ->transformer->remove duplictes->transformer->seqfile
This logic works fine
when
colA colB colC
1 A 10
1 B 20
2 C 30
2 D 40
Output comes as 1,A,10,B,20
2,c,30,D,40
But it fails when
colA colB colC
1 A 10
1 B 20
2 C 30
2 D 40
3 E 50
Output is coming as 1,A,10,B,20
2,C,30,D,40
3,A,C,10,B,20,E,50.
I am doing hash partitioning based on colA.Inside the stage variable if CurrKey = PrevKey then concatenate input columns into the stage variable.
Please let me know where am I going wrong.
Regards,
Gayathri
-
- Participant
- Posts: 134
- Joined: Tue Jun 15, 2010 2:10 am
- Location: Bangalore
Just to make sure nothing wrong in your derivation.Enable KeyChange Column in sort and write your derivation like
Code: Select all
svCurrent = If KeyChange=1 Then ColB:',':ColC Else svPrev:', ':ColB:',':ColC
svPrev=svCurrent
N.Srinivas
India.
India.
-
- Premium Member
- Posts: 60
- Joined: Thu Dec 17, 2009 1:36 am
-
- Premium Member
- Posts: 60
- Joined: Thu Dec 17, 2009 1:36 am
When I have done this sort of thing in the past I have had success but doing the following:
Sort the data by colA,hash partition by colA and set the key change indicator on.
I use three stage vars as follows:
svLastRowInGroup:
LastRowInGroup(colA)
(LastRowInGroup() function is set to 1 for true, zero for false)
svDataOut
if svLastRowInGroup and keyChange then colB:",":colC else if svLastRowInGroup then svDataOut:",":colB:",":colC else ""
(This logic checks single row records as well)
svData
if keyChange then colB:",":colC else svData:",":colB:",":colC
The transformer has a constraint of svLastRowInGroup
Sort the data by colA,hash partition by colA and set the key change indicator on.
I use three stage vars as follows:
svLastRowInGroup:
LastRowInGroup(colA)
(LastRowInGroup() function is set to 1 for true, zero for false)
svDataOut
if svLastRowInGroup and keyChange then colB:",":colC else if svLastRowInGroup then svDataOut:",":colB:",":colC else ""
(This logic checks single row records as well)
svData
if keyChange then colB:",":colC else svData:",":colB:",":colC
The transformer has a constraint of svLastRowInGroup
Bob Oxtoby
Then your logic is flawed. You'd need to supply the details of your job design to allow anyone to help without continuing to guess at this point.gayathrisivakumar wrote:when I try to run it on single node the results are completely different.
What field(s) did you specify as the change keys in the Sort stage? Is the KeyChange column being set correctly? Post your exact stage variables and derivations and the output link constraint you are using.
ps. You don't need "current" and "prev" variables when the Sort stage is setting the KeyChange column.
Last edited by chulett on Wed Aug 28, 2013 7:15 am, edited 1 time in total.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 60
- Joined: Thu Dec 17, 2009 1:36 am