vertical pivot

smitageorge · Post by **smitageorge** » Wed Mar 15, 2006 1:27 pm

[quote="ray.wurlod" Challenge that decision. [/quote]

If I would have been RAY WURLORD i could had challenged the decision but i am not

Smita

ray.wurlod · Post by **ray.wurlod** » Wed Mar 15, 2006 2:03 pm

You can challenge any decision when you're in the right.

That fact that you challenge a foolish decision doesn't mean that the fools who made it will change their minds, but even getting them to think about it is a start.

kcbland · Post by **kcbland** » Wed Mar 15, 2006 2:40 pm

Smita, the indicated post shows how to use stage variables. You must sort the data by the grouping keys and partition by them so that they stay together on the same node. The only issue is the last group won't output without some help. The "trigger" event on the stage variables is to only output the final concatenated row when the group changes, but the last group has no next group to indicate a group change. You'll have to add a row to the input for each node so that each node will output its final group. The stage variable recommendation is not the easiest or cleanest solution.

Sunshine2323 · Post by **Sunshine2323** » Thu Mar 16, 2006 12:42 am

Hi Smita,

Job Design:

Code: Select all

Input----->Transformer---->Remove Duplicate----->Output

Declare stage variables as follows:
StageVar=DSLink3.No
StageVar2=StageVar<> StageVar1
StageVar1=StageVar
StageVar5=DSLink3.Type:',':DSLink3.Name1:',': DSLink3.Name2
StageVar4=IF StageVar2=1 then DSLink3.No:',': StageVar5 else Trim(StageVar4):',':Trim(StageVar5)

Output has two column with the below derivation:
No=DSLink3.No
Output=StageVar4
Check for the field length of the Output Column to be suficient to hold the concatenated data

Remove Duplicate Stage:
Pass the rows through a Remove Duplicate Stage and give the key as NO and select Last in the property Duplicate to Retain

Tested With:

Code: Select all

1|A|AA|W 
1|B|BB|N
1|C|CC|Y 
1|D|DD|Z
2|A|AA|W 
2|B|BB|N
2|C|CC|Y

Output:

Code: Select all

1,A,AA,W,B,BB,N,C,CC,Y,D,DD,Z
2,A,AA,W,B,BB,N,C,CC,Y

Hope this helps

kcbland · Post by **kcbland** » Thu Mar 16, 2006 8:25 am

Sunshine2323

You have made my day.

Excellent technique, but please consider the partitioning required on a multi-node cluster to keep grouped rows together. Otherwise, the "last" aggregate solution doesn't require the final row output fix. Excellent.

The same solution would work with Server by changing the Remove Duplicates stage with an Aggregator stage using the "last" action.

DSguru2B · Post by **DSguru2B** » Thu Mar 16, 2006 8:40 am

smitageorge wrote:
My exact requirement is like this:

source:

NO TYPE NAME1 NAME2
1 A AA W
1 B BB
1 C CC Y
1 D DD Z

Create Temp with metadata like this

NO Type1 Type2 Type3 Type4 Type5 Type6 Type7 Type8 Type9 ....
1 A AA W B BB C CC Y ....

Thanks
smita

Yes, create it like as you have mentioned. Then from the Temp file pass it to a Transformer where if you want to concatenate the Type1 with Type2 with Type 3 and so on.
And there you have it ready for inserting to your target.
Sunshines method is a possible approach too, but I think this is much more simpler if you are not comfortable with staging variables. If you have a very high amount of data then staging variables might hit performance issues.
Let me know how it goes.

smitageorge · Post by **smitageorge** » Thu Mar 16, 2006 11:38 am

Sunshine2323 wrote:Hi Smita,

Job Design:
Code: Select all
Input----->Transformer---->Remove Duplicate----->Output
Declare stage variables as follows:
StageVar=DSLink3.No
StageVar1=StageVar<> StageVar1
StageVar2=StageVar
StageVar3=DSLink3.Type:',':DSLink3.Name1:',': DSLink3.Name2
StageVar4=IF StageVar2=1 then DSLink3.No:',': StageVar5 else Trim(StageVar4):',':Trim(StageVar5)

Output has two column with the below derivation:
No=DSLink3.No
Output=StageVar4
Check for the field length of the Output Column to be suficient to hold the concatenated data

Remove Duplicate Stage:
Pass the rows through a Remove Duplicate Stage and give the key as NO and select Last in the property Duplicate to Retain

Tested With:
Code: Select all
1|A|AA|W 
1|B|BB|N
1|C|CC|Y 
1|D|DD|Z
2|A|AA|W 
2|B|BB|N
2|C|CC|Y
Output:
Code: Select all
1,A,AA,W,B,BB,N,C,CC,Y,D,DD,Z
2,A,AA,W,B,BB,N,C,CC,Y
Hope this helps

Hi

What if we need to put that in different columns instead of comma?

Thanks
smita

DSguru2B · Post by **DSguru2B** » Thu Mar 16, 2006 11:48 am

Then donot concatenate the stage variables as sunshine has posted.
Instead just pass each stage var into a column.

vinaymanchinila · Post by **vinaymanchinila** » Mon Aug 28, 2006 1:27 pm

Hi,
I read through the post but did not get the StageVar5 part, what if we have unknown/varying numbers.

Below is my source, and I need to concatenate col03 based on Col01 and Col02. But the number of rows are changing for Col1.

Col01 Col03
S1 1232
S1 123w
S1 asde
S1 dewe

S2 asas
S2 aswe
S2 wewe

S3 4545
S3 5454
S3 5656
S3 6565
S3 6335

Required Ouptput
Col01 Col02
S1 1232123wasdedewe
S2 asasaswewewe
S3 45455454565665656335

Thanks,

mansoor_nb · Post by **mansoor_nb** » Mon Aug 28, 2006 3:32 pm

Smitha,
You can use the Combined Record stage. where in you have to create a subrecord column for the key.In your case the key is NO.

vinaymanchinila · Post by **vinaymanchinila** » Mon Aug 28, 2006 5:44 pm

Hi Mansoor,
I did play around with the combine record stage, but I am not sure how it will combine the columns into one row.

Thanks,

kumar_s · Post by **kumar_s** » Tue Aug 29, 2006 2:43 am

Below is my source, and I need to concatenate col03 based on Col01 and Col02.

But I couldnt see any Col2 in your example.

If it is simple concatination, you can acheive using transformer.
Sort baseded on Col1. Use stage variable to check for repeated value in Col1, and based on that, concatinate the Col3 with previuos value.
Something like this.

Code: Select all

v_Col1_Cur = Link.Col1

vConcatinated = If v_Col1_Cur = vCol1_Previous Then vConcatinated:Link.Col3 Else Link.Col3

vCol1_Previous = v_Col1_Cur

Assing vConcatinated to Col2 and find the max using removeduplicate sage or use sort again to retain the first/last record.

vinaymanchinila · Post by **vinaymanchinila** » Tue Aug 29, 2006 5:13 am

Hi Kumar,
I am using a composite key (Col01) , thanks for the idea, also do I need to add a dummy row at the end so that the last record is captured?

Thanks again.

kumar_s · Post by **kumar_s** » Tue Aug 29, 2006 6:27 am

I dont find any reason to add an additional dummy row. The last row in the group should have all the concatinated value.