Data Pivoting/Spliting
Moderators: chulett, rschirm, roy
Data Pivoting/Spliting
Data is coming in the following format:
Column1 Column2
1 Bill,Brain,Greagory
2 Sam, Bopel
3
4 Toby
As you see each row has different number of values. I need to preserve Column1 and associate values.
Output should be as below:
1 Bill
1 Brain
1 Greagory
2 Sam
2 Bopel
3
4 Toby
Please give me some input how to implement in DataStage.
Thank you for valuable input and help.
Column1 Column2
1 Bill,Brain,Greagory
2 Sam, Bopel
3
4 Toby
As you see each row has different number of values. I need to preserve Column1 and associate values.
Output should be as below:
1 Bill
1 Brain
1 Greagory
2 Sam
2 Bopel
3
4 Toby
Please give me some input how to implement in DataStage.
Thank you for valuable input and help.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
-
- Premium Member
- Posts: 67
- Joined: Thu Aug 09, 2007 7:51 pm
Hi Sainath,Sainath.Srinivasan wrote:Or join with a row generator and do a split the fields.
I have a similar requirement and I am trying to follow your note. How does a row generator help? What is the column to join? Since Row Generator generates arbitrary (mock) data how could we use it for the task at hand?
Thanks,
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
Row generator will produce 2 columns which are
1.) a dummy column (say constant 1)
2.) sequence from 1 to n
(n = max possible list of values in the column you wish to split)
Add the same dummy column to your input row.
Join them, use the sequence column to field() and add necessary constraints.
You will have the required output.
1.) a dummy column (say constant 1)
2.) sequence from 1 to n
(n = max possible list of values in the column you wish to split)
Add the same dummy column to your input row.
Join them, use the sequence column to field() and add necessary constraints.
You will have the required output.
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
-
- Premium Member
- Posts: 67
- Joined: Thu Aug 09, 2007 7:51 pm
Hi Sainath,Sainath.Srinivasan wrote:Row generator will produce 2 columns which are
1.) a dummy column (say constant 1)
2.) sequence from 1 to n
(n = max possible list of values in the column you wish to split)
Add the same dummy column to your input row.
Join them, use the sequence column to field() and add necessary constraints.
You will have the required output.
Thanks! I tried to follow the suggestion with a paper and pencil. I am stuck.
For the example in question, here is what I have
Row Generator (for the second column, the limit is set to 4)
Col1 Col2
1 1
1 2
1 3
1 4
1 1
1 2
1 3
1 4
The input (as in the original question) is
colX colY
1 Bill,Brain,Greagory
2 Sam, Bopel
3
4 Toby
In the Join stage, I was wondering if I should be joining col2 with colX? Sorry I dont get it.
Thanks,
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
====
Input:
====
Step1:
So generate 3 rows from rowgen.
=====
rowgen:
=====
Step2:
Add dummy column to input
Join by colDummy
derive columns
Step 5:
Input:
====
Here the max occurance = 3 (for row 1).colX colY
1 Bill,Brain,Greagory
2 Sam, Bopel
3
4 Toby
Step1:
So generate 3 rows from rowgen.
=====
rowgen:
=====
colDummy colIndex
1 1
1 2
1 3
Step2:
Add dummy column to input
Step3:colDummy colX colY
1 1 Bill,Brain,Greagory
1 2 Sam, Bopel
1 3
1 4 Toby
Join by colDummy
Step 4:This will give you (4 x 3 = ) 12 rows.
derive columns
Code: Select all
out.colX = inp.colX
newColY = field(inp.colY, ',', colIndex)
Note - here you will have to control for the number of entries in the row.
Step 6:You have your output.
Code: Select all
Leave the money in usual place.
-
- Premium Member
- Posts: 67
- Joined: Thu Aug 09, 2007 7:51 pm