converting rows to columns

Post questions here related to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

bobby
Participant
Posts: 86
Joined: Mon Jul 19, 2004 8:31 pm

converting rows to columns

Post by bobby »

Hi,
I have this input:
1 n0
1 n1
1 n2
2 b1
2 b2
2 b3


and want this output:

1 n0,n1,n2
2 b1,b2,b3

How can I achieve that?
Thanks,
Bobby
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

An Aggregator is easiest; stage variables can get lengthy, and you have that pesky problem of what to do with the last row. Set up a transformer with an output link to an aggregator. The output link has four columns: the key plus one column per value. For each row, map the source value to the appropriate target column; if the row is not for that column, put in a zero. A sketch follows below.

In the aggregator, group by the key and do a Max derivation for each of the three value columns.
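
A rough sketch of the transformer derivations, assuming hypothetical link and column names (in.KEY, in.VALUE, VAL1-VAL3) and input sorted by key; a counter stage variable decides which target column each row feeds:

Code: Select all

* Stage variables (hypothetical; initialise svCount to 0 and svLastKey to '')
svCount   <- If in.KEY = svLastKey Then svCount + 1 Else 1
svLastKey <- in.KEY

* Output link derivations
KEY  <- in.KEY
VAL1 <- If svCount = 1 Then in.VALUE Else 0
VAL2 <- If svCount = 2 Then in.VALUE Else 0
VAL3 <- If svCount = 3 Then in.VALUE Else 0

In the aggregator, group on KEY and take Max of VAL1, VAL2 and VAL3.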
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You can use a multi-valued hash file with the first column as its key.

Alternatively, you can store each record in a temp table with commit set to every record, then look that table up for the next record and concatenate to the previous value.

I think the Pivot stage only goes from columns to rows.
talk2shaanc
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

Using the aggregator is the best and simplest way, as kcbland has suggested. A hash file or a database adds extra overhead and is not advisable; it is not a good design.
If your source file is huge, an intermediate staging area (hash file or database) followed by a join will eat up a lot of space on your server or database. In your example you have only 1+3 columns; imagine if you had 40-50 columns. You would need 40-50 hash files or database tables and that many joins. Your server will shout at you :evil: and go mad :x .
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

By the way, the hash-file or database-table method needs only ONE hash file or table to handle any number of columns - even a million or more.

All you do is read and update the same object, building the string up record by record.

The advantage of this method is that you need not worry about whether there can be more than three entries for one key at any point in time.

The only constraint is how much your database or hash file can hold.

If you need a detailed explanation of the process, do let me know.
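
A minimal DS BASIC sketch of the read-and-update idea (the file name CONCAT.HASH and the variables key and desc are hypothetical; the real logic would live in a routine called once per row):

Code: Select all

* Accumulate the description list per key in a single hash file.
open 'CONCAT.HASH' to hf else stop 'Cannot open hash file'

* For each incoming row (key, desc):
read oldval from hf, key then
   newval = oldval : ',' : desc
end else
   newval = desc
end
write newval on hf, key

close hf

At end of job you read the hash file back to get one concatenated string per key.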
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

If you are going down the aggregator route, here is an extra tip: use stage variables for the description and the key, as follows.

Code: Select all

stgDesc default '' (blank)
stgKey  default @NULL (NULL)

stgDesc <- If linkKey = stgKey Then stgDesc : ',' : linkDesc Else If IsNull(stgKey) Then linkDesc Else linkDesc
stgKey  <- linkKey

The IsNULL is more for belt-and-braces.

You can pass ALL incoming rows to the aggregator. In the aggregator stage, group by the key and use the Last function for the description.

This way, you get the key plus the full concatenated list of descriptions.
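
With the sample input, the stage variables evolve like this (the aggregator's Last function then picks up the final value for each key):

Code: Select all

linkKey  linkDesc  stgDesc after the row
1        n0        n0
1        n1        n0,n1
1        n2        n0,n1,n2
2        b1        b1
2        b2        b1,b2
2        b3        b1,b2,b3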
battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London

Post by battaliou »

You could use BASIC:

Code: Select all

* Read the input three rows at a time and write one concatenated row.
* Assumes exactly three rows per key; adjust the loop limit to suit.
openseq 'in.txt' to f else return
openseq 'out.txt' to t else return
loop
   out = ''
   for i = 1 to 3
      readseq rec from f else
         * End of file: close both files and stop
         closeseq f
         closeseq t
         return
      end
      * Keep the whole first row (key + value), then append just the value field
      if out = '' then out = rec else out := ',' : field(rec, ' ', 2)
   next i
   writeseq out to t else return
repeat
