Horizontal Pivot
Moderators: chulett, rschirm, roy
Horizontal Pivot
Hi all,
Here's the requirement that i'm facing:
I have an input file with two columns :
For example, the first column contains :
"a"
The second column contains :
"1col1|2col2|3col3".
Do you know how i can do in the "derivation" of my pivot to have in my output file :
"a;1col1"
"a;2col2"
"a;2col3"
(use the "|" in the derivation???)
Thanks in advance,
cool
Here's the requirement that i'm facing:
I have an input file with two columns :
For example, the first column contains :
"a"
The second column contains :
"1col1|2col2|3col3".
Do you know how i can do in the "derivation" of my pivot to have in my output file :
"a;1col1"
"a;2col2"
"a;2col3"
(use the "|" in the derivation???)
Thanks in advance,
cool
Use the Pivot stage to do this. It shouldnt be that hard. Try to come up with a solution, then go on from there.
PS: Pivot Stage is a very sensitive stage to use and it totally positional. It does what you say to do.
PS: Pivot Stage is a very sensitive stage to use and it totally positional. It does what you say to do.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Another option.
User transformer to Split the second column into three.for the first link for the second link for the second link.
User transformer to Split the second column into three.
Code: Select all
In.Column[1,INDEX(In.Column,'|',1)-1]
Code: Select all
In.Column[INDEX(In.Column,'|',2),(INDEX(In.Column,'|',2)-INDEX(In.Column,'|',1)-)]
Code: Select all
In.Column[INDEX(In.Column,'|',3),(Len(In.Column)-INDEX(In.Column,'|',3))]
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
thanks for you help kumar_s, but in my job i must not to know the number of columns in my output file, i must to create a output file with a number of columns dynamically. That's why, i think the pivot is the best solution but i don't know what i put in the derivation...???kumar_s wrote:Another option.
User transformer to Split the second column into three.for the first linkCode: Select all
In.Column[1,INDEX(In.Column,'|',1)-1]
for the second linkCode: Select all
In.Column[INDEX(In.Column,'|',2),(INDEX(In.Column,'|',2)-INDEX(In.Column,'|',1)-)]
for the second link.Code: Select all
In.Column[INDEX(In.Column,'|',3),(Len(In.Column)-INDEX(In.Column,'|',3))]
Re: Horizontal Pivot
basic wrote:Hi all,
Here's the requirement that i'm facing:
I have an input file with two columns :
For example, the first column contains :
"a"
The second column contains :
"1col1|2col2|3col3".
Do you know how i can do in the "derivation" of my pivot to have in my output file :
"a;1col1"
"a;2col2"
"a;3col3" //correction
(use the "|" in the derivation???)
Thanks in advance,
cool
-
- Participant
- Posts: 407
- Joined: Mon Jun 27, 2005 8:54 am
- Location: Walker, Michigan
- Contact:
Re: Horizontal Pivot
I don't know how to do this with pivot stage.
However, you could:
Use a sequential file stage with a delimeter that does not occur in the file. Maybe \xF8 Also, set Quote to none. Store this in one column. Example col1 varchar without any length.
Send col1 down a link to an external filter. Run this code in the filter.
Send the filtered results down a link to the column import. Create an output link to the destination. In the column import stage, set the delimeter to ";" and Quote to none. Also, create the columns in the output that you need based upon the assignment names.
However, you could:
Use a sequential file stage with a delimeter that does not occur in the file. Maybe \xF8 Also, set Quote to none. Store this in one column. Example col1 varchar without any length.
Send col1 down a link to an external filter. Run this code in the filter.
Code: Select all
## AUTHOR: Ryan T. Putnam
awk 'BEGIN {FS=";"; OFS=";";}
{
START=0;
LEN=index(substr($2, START), "|");
START=START + LEN + 1;
if (( LEN == 0 ))
{
print $1,$2;
}
else
{
print $1,substr($2, 1, LEN -1);
while (LEN > 0)
{
LEN=index(substr($2, START), "|");
if (( LEN > 0 ))
{
print $1,substr($2, START, LEN -1);
}
else
{
print $1,substr($2, START);
}
START=START + LEN;
}
}
}'
Hi,basic wrote: That's why, i think the pivot is the best solution but i don't know what i put in the derivation...???
In the derivations write the column names seperated by comma.
if u have 4 columns (Key1,Col1,col2,col3) then
in the output create "Key1" and "CombinedColOp" fields.
In derivation of "CombinedColOp" write the columns names seperated by coma i.e. "Col1,col2,col3"
Regards
Senthil
A much simpler awk script that that listed above would achieve what you need. Have a look at this thread. There is a reply that I contributed that is very close to what you are after. Tweak the awk script in that and you will have what you need. Suggestion: if you end up with an awk script with more than about 3 lines as a solution for this, you are probably not doing it very efficiently.
David
David
(Previously known as D)
Be alturistic and donate your spare CPU cycles to research. http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1
Be alturistic and donate your spare CPU cycles to research. http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1
-
- Participant
- Posts: 407
- Joined: Mon Jun 27, 2005 8:54 am
- Location: Walker, Michigan
- Contact:
I will say that this 3 liner below is much easier to read, but is much slower. It takes 1.5 seconds for a 70000 record test I ran vs. the longer version which takes only 1 second. The mulitple delimeters have my head spinning.
I am not sure I coded this with 100% efficiency (as it runs slower than the previous version), but I did want to comply with the 3 line rule.
Code: Select all
## Author Ryan T. Putnam
awk 'BEGIN {FS=";"; OFS=";";}
{
TO=split($2, ARRAY, "|");
if ( TO == 0 ) { print $1 OFS; }
for (i=1;i<=TO;i++) { print $1,ARRAY[i]; }
}'
-
- Participant
- Posts: 407
- Joined: Mon Jun 27, 2005 8:54 am
- Location: Walker, Michigan
- Contact:
DJM's solution he provided earilier is much faster. I find it more difficult to read, but it runs in .5 seconds. I think he was being modest by saying 3 lines because his solution will fit on one line very nicely.
DJM's solution modified by Ryan.
DJM's solution modified by Ryan.
Code: Select all
awk 'BEGIN {FS=";"; OFS=";";} { gsub(/|/, "\n" $1 OFS); print $0; }'
Ahh, but my programming layout style would have laid out your "core" code like so:
for maintainability reasons.
I don't know whether that makes mine 2 or 4 lines! By the way, nice implementation of a "leave this as an exercise for the reader" :D
David
Code: Select all
{
gsub(/|/, "\n" $1 OFS);
print $0;
}
I don't know whether that makes mine 2 or 4 lines! By the way, nice implementation of a "leave this as an exercise for the reader" :D
David
(Previously known as D)
Be alturistic and donate your spare CPU cycles to research. http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1
Be alturistic and donate your spare CPU cycles to research. http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1