Concatenation of Rows
Moderators: chulett, rschirm, roy
Hi,
I have a requirement to concatenate values from different rows of a single column. How can this be done in DataStage?
For Example
NO    TYPE
1     A
1     B
1     C
1     D
At the end, my requirement is:
1     A,B,C,D
Please advise.
Thanks,
Sengs
Requirement:
Vertical pivot based on the value of a pivot field (in this case NUM)
Data file has two columns - NUM, TYPE (although the solution can be tailored for more)
Assumptions:
Number of records with the same value in pivot field is unknown
Pivot field cannot be an empty string
Solution:
Description
As each record is read, the value of the pivot field is compared to the previous value.
If the values are different, the newRecord variable is initialised with the NUM and TYPE fields in CSV format.
If the values are the same, the TYPE field is appended to the newRecord variable in CSV format.
Each record is written to the hash file, and because the pivot field is used as the key, every record with the same value in the pivot field overwrites the previous record.
Code: Select all
L1 L2
SEQ---------->TFM---------->HSH
Define sequential file with two columns
NUM varchar 10
TYPE varchar 10
Define Transform as follows
Stage Variables
currentKey
Initial value = ""
Derivation = L1.NUM
newRecord
Initial value = ""
Derivation = If currentKey <> lastKey Then L1.NUM : "," : L1.TYPE Else newRecord : "," : L1.TYPE
lastKey
Initial value = ""
Derivation = currentKey
L2 Derivations
L2.key = L1.NUM
L2.line = newRecord
Define hash file with two columns
key varchar 10 (marked as the key)
line varchar 200
For a file with the following values:
Code: Select all
1,A
1,B
1,C
1,D
2,A
The result is a hash file that looks like this:
Code: Select all
1 1,A,B,C,D
2 2,A
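For anyone who finds DataStage stage variables hard to follow, here is a minimal Python sketch of the same logic (not DataStage code, just an illustration): each row's TYPE is appended to a per-key record, and overwriting the dictionary entry mimics the hashed-file write where the last record per key wins.

```python
# Sample input: (NUM, TYPE) pairs, as in the sequential file above.
rows = [("1", "A"), ("1", "B"), ("1", "C"), ("1", "D"), ("2", "A")]

pivoted = {}  # key -> concatenated CSV line; overwriting mimics the hashed-file write
for num, typ in rows:
    if num in pivoted:
        pivoted[num] += "," + typ        # same key: append the TYPE value
    else:
        pivoted[num] = num + "," + typ   # new key: start a fresh record

for key, line in sorted(pivoted.items()):
    print(key, line)
# 1 1,A,B,C,D
# 2 2,A
```

Unlike the Transformer version, the dictionary approach does not require the input to be sorted on the pivot field.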
Jim Paradies
Jim, you should post that to the FAQ forum as the solution for a vertical pivot requirement.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney
Vince,
I'm afraid I'm not privileged enough to do that.
Maybe you could do it for me!
Code: Select all
Sorry, but only users granted special access can post topics in this forum.
Jim Paradies
Roy! Could you wave your magic wand at Jim and make him a FAQ Genie? His vertical pivot answer beats the FAQ out of the previous responses for completeness.
If you use field mark (@FM) rather than comma, your data will magically be in multiple columns in the hashed file even though you only load the key and field #1. You can then read from the hashed file with a full set of column definitions.
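The idea behind the field-mark trick can be illustrated in Python (an analogy only, not DataStage code): @FM is the UniVerse field mark, CHAR(254), and a reader that splits records on that character sees each joined value as a separate column.

```python
# Analogy to the @FM suggestion: join values with the UniVerse field mark
# (CHAR(254)) instead of a comma. A reader splitting on that delimiter
# then sees the values as separate columns, with no clash against commas
# that may appear inside the data itself.
FM = chr(254)  # UniVerse/DataStage field mark, @FM

record = FM.join(["A", "B", "C", "D"])  # stored as a single string/field
columns = record.split(FM)              # read back as separate columns
print(columns)
# ['A', 'B', 'C', 'D']
```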
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ray,
I assume that to do this, you would have to construct a hash file with the correct number of columns first. I tried simply changing the commas to @FM, and I got the following result for the same input:
Code: Select all
1,1
2,2
Jim Paradies
Aha, that's what you got with View Data.
Add more column definitions - even to the metadata in the Columns grid, and look again!
That is, the Hashed File stage has the ability to populate undefined columns, in exactly the same way that the Repository database tables contain undefined columns that are, nonetheless, accessible. But not by name - columns in hashed files are accessed by ordinal storage position in the record.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Re: Concatenation of Rows
sengs wrote: Hi,
Hi friends, thanks for the very nice information. I got the answer to my problem.
SHESHU
Is this a question, reply, some kind of graffiti or... oh, I see - a spirit guide for the OP, returned to let us know all is well almost 11 months later.
Someone needs to learn how to quote more better. That or just use the Reply to Topic button.
-craig
"You can never have too many knives" -- Logan Nine Fingers
jzparad wrote: Requirement: Vertical pivot based on the value of a pivot field (in this case NUM)...
Will this work in PX jobs too? Instead of writing into a hash file, can we write the data to a Dataset and remove duplicates so as to retain the last record?
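The PX idea — write every intermediate record, then keep only the last one per key — can be sketched in Python (an illustration, not a PX job; it assumes the input is sorted on the key, as a Remove Duplicates stage would require):

```python
# Sketch of the PX variant: build the running concatenation for every
# input row (as the Transformer would), keep all intermediate records
# (the dataset write), then de-duplicate keeping the LAST row per key,
# which holds the complete concatenated value. Assumes input sorted on
# the key field.
rows = [("1", "A"), ("1", "B"), ("1", "C"), ("1", "D"), ("2", "A")]

staged = []            # every intermediate record, like the dataset write
running, last_key = "", None
for key, typ in rows:
    running = key + "," + typ if key != last_key else running + "," + typ
    staged.append((key, running))
    last_key = key

deduped = dict(staged)  # later entries win: "remove duplicates, retain last"
print(sorted(deduped.items()))
# [('1', '1,A,B,C,D'), ('2', '2,A')]
```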