Concatenation of Rows

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

sengs
Participant
Posts: 55
Joined: Thu Nov 24, 2005 12:51 am

Concatenation of Rows

Post by sengs »

Hi,
I have a requirement to concatenate values from different rows of a single column. How can this be done in DataStage?
For Example

NO ........... TYPE
1 ....................A
1.................... B
1.................... C
1.................... D

In the end, my required output is:
1.........................A,B,C,D

Please advise.
Thanks,
Sengs
djm
Participant
Posts: 68
Joined: Wed Mar 02, 2005 3:42 am
Location: N.Z.

Post by djm »

Hi sengs,

This is often referred to on these forums as a "vertical pivot". Search the forum and see what you can find. Another search option to try is "man paste" (Craig - don't go there!).

David.
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

Requirement:
Vertical pivot based on the value of a pivot field (in this case NUM)
Data file has two columns - NUM, TYPE (although the solution can be tailored for more)


Assumptions:
Number of records with the same value in pivot field is unknown
Pivot field cannot be an empty string


Solution:

Code: Select all

      L1             L2
SEQ---------->TFM---------->HSH



Define sequential file with two columns
NUM varchar 10
TYPE varchar 10


Define Transform as follows

Stage Variables
currentKey
   Initial value = ""
   Derivation = L1.NUM
newRecord
   Initial value = ""
   Derivation = If currentKey <> lastKey Then L1.NUM : "," : L1.TYPE Else newRecord : "," : L1.TYPE
lastKey
   Initial value = ""
   Derivation = currentKey

L2 Derivations
L2.key = L1.NUM
L2.line = newRecord


Define hash file with two columns
key varchar 10 (marked as the key)
line varchar 200

Description
As each record is read, the value of the pivot field is compared to the previous value.
If the values differ, the newRecord variable is initialised with the NUM and TYPE fields in CSV format.
If the values are the same, the TYPE field is appended to newRecord in CSV format.
Each record is written to the hash file. Because the pivot field is the key, every record with the same pivot value overwrites the previous one, so only the last, fully concatenated record for each key remains.

For a file with the following values:

Code: Select all

1,A
1,B
1,C
1,D
2,A
The result is a hash file that looks like this:

Code: Select all

1   1,A,B,C,D
2   2,A
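
The same accumulate-and-overwrite logic can also be sketched as a standalone DataStage BASIC routine (paths and names below are hypothetical, the hashed file is assumed to already exist, and rows for the same NUM are assumed to arrive consecutively):

Code: Select all

* Sketch only: read a comma-delimited file whose rows for the same NUM
* arrive consecutively and build one concatenated record per NUM in a
* hashed file.
      OpenSeq "/tmp/input.txt" To InFile Else Stop "Cannot open input file"
      OpenPath "/tmp/PivotHash" To OutFile Else Stop "Cannot open hashed file"

      LastKey   = ""
      NewRecord = ""

      Loop
         ReadSeq Line From InFile Else Exit           ;* leave the loop at end of file
         CurrentKey = Field(Line, ",", 1)
         TypeValue  = Field(Line, ",", 2)

         If CurrentKey <> LastKey Then
            NewRecord = CurrentKey : "," : TypeValue   ;* start a new group
         End Else
            NewRecord = NewRecord : "," : TypeValue    ;* append to the current group
         End

         Write NewRecord On OutFile, CurrentKey        ;* same key overwrites the previous write
         LastKey = CurrentKey
      Repeat

      CloseSeq InFile
      Close OutFile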
Jim Paradies
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

D wrote: (Craig - don't go there!)
Doh! Cut off at the knees. Ok, not going there. :lol:
-craig

"You can never have too many knives" -- Logan Nine Fingers
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Jim, you should post that to the FAQ forum as the solution for a vertical pivot requirement.
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

Vince,

I'm afraid I'm not privileged enough to do that.

Code: Select all

Sorry, but only users granted special access can post topics in this forum.
Maybe you could do it for me!
Jim Paradies
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

By the way, I forgot to list one important assumption.

The data has to be sorted by the pivot field.
Jim Paradies
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Roy! Could you wave your magic wand at Jim and make him a FAQ Genie? His vertical pivot answer beats the FAQ out of the previous responses for completeness.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If you use field mark (@FM) rather than comma, your data will magically be in multiple columns in the hashed file even though you only load the key and field #1. You can then read from the hashed file with a full set of column definitions.
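
A minimal sketch of how the newRecord derivation in the job above might change under this approach (an untested reading of the suggestion, using the same stage-variable names as before):

Code: Select all

newRecord
   Initial value = ""
   Derivation = If currentKey <> lastKey Then L1.NUM : @FM : L1.TYPE Else newRecord : @FM : L1.TYPE

Because @FM is the dynamic-array field delimiter, each TYPE value then lands in its own field of the stored record, which is what lets the Hashed File stage expose them as separate columns.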
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

Ray,
If you use field mark (@FM) rather than comma, your data will magically be in multiple columns in the hashed file even though you only load the key and field #1.
I assume that to do this, you would have to construct a hash file with the correct number of columns first. I tried simply changing the commas to @FM and got the following result for the same input:

Code: Select all

1,1
2,2
Jim Paradies
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Aha, that's what you got with View Data.

Add more column definitions - even to the metadata in the Columns grid, and look again!

That is, the Hashed File stage has the ability to populate undefined columns, in exactly the same way that the Repository database tables contain undefined columns that are, nonetheless, accessible. But not by name - columns in hashed files are accessed by ordinal storage position in the record.
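
A minimal DataStage BASIC sketch of that positional access, assuming the record was written with @FM delimiters as described above (path, record key and variable names are hypothetical):

Code: Select all

* Sketch only: open the hashed file by path and pull fields out of one
* record by ordinal position in the stored dynamic array.
      OpenPath "/tmp/PivotHash" To HashFile Else Stop "Cannot open hashed file"

      Read Rec From HashFile, "1" Then     ;* "1" is the record key
         FirstField  = Rec<1>              ;* field #1: "1" (the NUM repeated in the data)
         SecondField = Rec<2>              ;* field #2: "A"
         ThirdField  = Rec<3>              ;* field #3: "B"
      End Else
         Print "Record not found"
      End

      Close HashFile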
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

Absolutely right!

I added new columns to the hash file stage and without re-running the job, I could view the data.
Jim Paradies
sheshadri
Participant
Posts: 11
Joined: Mon Jul 24, 2006 4:17 am
Location: any ware

Re: Concatenation of Rows

Post by sheshadri »

sengs wrote:Hi,
Hi friends, thanks for sending this very helpful information.
I got the answer to my problem.

NO ........... TYPE
1 ....................A
1.................... B
1.................... C
1.................... D

At the end my requirement is,
1.........................A,B,C,D

Please advice.
Thanks,
Sengs
SHESHU
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Is this a question, reply, some kind of graffiti or... oh, I see - a spirit guide for the OP, returned to let us know all is well almost 11 months later. :wink:

Someone needs to learn how to quote more better. That or just use the Reply to Topic button. 8)
-craig

"You can never have too many knives" -- Logan Nine Fingers
nvuradi
Participant
Posts: 54
Joined: Tue Dec 12, 2006 11:03 pm

Post by nvuradi »

jzparad wrote: [vertical pivot solution quoted in full - see the post above]
Will this work in PX jobs as well? Instead of writing to a hashed file, could we write the data to a Dataset and remove duplicates so that only the last record is retained?
Post Reply