Need help with pivoting columns
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 103
- Joined: Fri Jan 18, 2008 12:55 am
- Location: chennai
Need help with pivoting columns
Hi, I have a requirement where the input gives me two columns, as below:
ID NAME
1 XX
1 YY
1 XY
2 SS
2 DD
2 PP
I want the output as:
ID NAME
1 XX#YY#XY
2 SS#DD#PP
How can I achieve this?
I am using DataStage Server Edition.
*Note: Title edited to be more descriptive - Content Editor*
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
Re: Need a help
Use the following UNIX script; it should satisfy your requirement. Use the output of this script (outputFile.txt) as the input to the DataStage job.
Here InputFile.txt is:
ID NAME
1 XX
1 YY
1 XY
2 SS
2 DD
2 PP
awk 'NR == 1 { print; next }                     # pass the header through
{
    row[$1] = ($1 in row) ? row[$1] "#" $2 : $2  # append NAME with a # joiner
    if (!seen[$1]++) order[++c] = $1             # remember first-seen ID order
}
END {
    for (i = 1; i <= c; i++) print order[i], row[order[i]]
}' InputFile.txt > outputFile.txt
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Ur was a city in ancient Babylon.
The second person possessive pronoun in English is spelled "your", not "ur". Please strive for a professional standard of written English on DSXchange, because even good English is difficult enough for those whose first language is not English. Further, DSXchange is not an SMS-enabled device.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Since we have a limited number of programmers with UNIX scripting knowledge, we use a hashed file as the solution to this type of problem.
The flat file is read in and looked up against a hashed file that has column 1 as the key. If the key is found in the hashed file, we write back to the same hashed file, where the second column derivation is <HashedFile1.Column2> : '#' : InputFile.Column2.
Note that the lookup hashed file is the same hashed file as the output in this job.
We use an empty file in the job to create the empty lookup hashed file.
This works fine for us the couple of times we have needed it, and the process is contained in the job, so the on-call support person can see what is going on; running UNIX scripts from DataStage is somewhat hidden to the bleary-eyed programmer at 3:00 AM.
Warning: this will be slower than running a UNIX script.
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
Code: Select all
yourStgField = If (prevKey = Key)
Then yourStgField : delim : inField
Else inField
ID NAME
1 XX
1 YY
1 XY
2 SS
2 DD
2 PP
I want the output as:
ID NAME
1 XX#YY#XY
2 SS#DD#PP
=================================
Transformer:
1. Define three stage variables:
===> stgCount, number, initial value 0
===> stgApndVal, varchar (...), initial value '~~~~~~~~~'
===> stgKey
2. Derivations:
===> stgCount: if link.ID <> stgKey then 1 else stgCount + 1
===> stgApndVal: if link.ID <> stgKey then link.NAME else stgApndVal : '#' : link.NAME
===> stgKey: link.ID
3. Add a new output field APND_NAME and assign stgApndVal to it.
4. Add a new output field SRL_NO and assign stgCount to it.
Then use a Remove Duplicates stage with partitioning Same (since you already partition on ID at the input of the transformer), sorted on ID and SRL_NO, with key ID, taking the last row.
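To see what those stage variables compute, here is a rough awk equivalent (awk variables standing in for stgKey, stgApndVal, and stgCount, with "keep the last value per ID" standing in for the Remove Duplicates step; names are illustrative only):

```shell
result=$(printf '1 XX\n1 YY\n1 XY\n2 SS\n2 DD\n2 PP\n' | awk '
{
    # Stage-variable logic: reset on a new ID, otherwise append.
    if ($1 != stgKey) { stgCount = 1; stgApndVal = $2 }
    else              { stgCount++;  stgApndVal = stgApndVal "#" $2 }
    stgKey = $1

    last[$1] = stgApndVal              # later rows overwrite earlier ones:
    if (!seen[$1]++) order[++c] = $1   # the "remove duplicates, take last" step
}
END {
    for (i = 1; i <= c; i++) print order[i], last[order[i]]
}')
echo "$result"
```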
Code: Select all
transformer -----> remove duplicate ---> target
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Sainath's reply gave me a few ideas... Sainath.Srinivasan wrote: "or simply sort the incoming data and append to the existing value using a stage variable."
Something like
sort input -> tx -> hashed file
and in tx do
yourStgField = If (prevKey = Key)
Then yourStgField : delim : inField
Else inField
where yourStgField is a stage variable.
My first attempt was to try doing the above without an intermediate hashed file. Instead, I used the RowProcGetPreviousValue and RowProcCompareWithPreviousValue transforms to detect when the key had changed.
The problems were:
1) When the key changed, I was always "one row ahead". I want to output when I'm on the last key in the group, not the first key in the next group.
2) This code didn't work for the last record, since I don't know the end-of-file until after the fact. See viewtopic.php?t=126410&highlight=
So I can see why Sainath uses a hashed file for output: unconditionally output all rows to the hashed file, overwriting each previous value with the concatenated stage variable, which builds up across each row. When the key changes, voilà, your last.key and pivoted data were just saved to the hashed file.
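That "unconditionally write every row" behaviour can be traced in a few lines (a sketch; each print stands in for one write to the hashed file):

```shell
trace=$(printf '1 XX\n1 YY\n1 XY\n2 SS\n2 DD\n2 PP\n' | awk '
{
    acc = ($1 == prev) ? acc "#" $2 : $2   # stage-variable accumulation
    prev = $1
    print "write:", $1, acc                # each write replaces the previous
}')                                        # one for this ID
echo "$trace"
```

All six rows are written, but within each ID only the last write (the full pivot) survives in the hashed file.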
However, I also decided to approach from a different tack...
I'm very familiar with SAS programming, which has the concept of first.key and last.key automatic variables. Since DS doesn't have these variables, I decided to construct them.
SAS also has the concept of "interleaving tables", which is a merge by row position rather than by keys. With this in mind, and by saving the Row Number to permanent storage, I was able to create "Look Ahead" and "Look Behind" variables (LookBehind is a bit redundant given RowProcGetPreviousValue). With these I can then derive first.key and last.key boolean flags. I can do this for sub-groups as well.
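The look-ahead construction of first.key and last.key can be sketched with a one-row buffer (illustrative only; the actual job uses RowProcGetPreviousValue and saved row numbers rather than awk):

```shell
# Emit each row with first.key and last.key flags by buffering one row
# and peeking at the next row's key before printing.
flags=$(printf '1 XX\n1 YY\n1 XY\n2 SS\n2 DD\n2 PP\n' | awk '
{
    if (have) {
        lastflag = (pkey != $1)          # next key differs => last of group
        print pkey, pname, pfirst, lastflag
    }
    pfirst = (!have || pkey != $1)       # first row, or key change
    pkey = $1; pname = $2; have = 1
}
END { if (have) print pkey, pname, pfirst, 1 }')
echo "$flags"
```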
I've saved my proof of concept job (DSX file you can import) here:
http://docs.google.com/Doc?docid=dcdxxj ... v8dm&hl=en
I've saved the test data I used in this job here:
http://docs.google.com/Doc?docid=dcdxxj ... k3g5&hl=en. The test data is purposely stored unsorted.
This was really just a proof of concept, an academic exercise as much as anything. It might perform like a dog with large data volumes - I haven't tested it. But, it does solve the original poster's problem, and perhaps is a generic approach to first.key and last.key processing.
Or I may have just reinvented a perfectly round wheel that I'd have found if I'd searched the forums a bit harder.
HTH...
Last edited by sbass1 on Wed Apr 08, 2009 9:14 pm, edited 1 time in total.
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom