Horizontal Pivot

basic · Post by **basic** » Wed Mar 22, 2006 7:53 am

Hi all,

Here's the requirement that i'm facing:

I have an input file with two columns :

For example, the first column contains :
"a"

The second column contains :

"1col1|2col2|3col3".

Do you know how i can do in the "derivation" of my pivot to have in my output file :

"a;1col1"
"a;2col2"
"a;2col3"

(use the "|" in the derivation???)

Thanks in advance,

cool

DSguru2B · Post by **DSguru2B** » Wed Mar 22, 2006 8:40 am

Use the Pivot stage to do this. It shouldnt be that hard. Try to come up with a solution, then go on from there.
PS: Pivot Stage is a very sensitive stage to use and it totally positional. It does what you say to do.

kumar_s · Post by **kumar_s** » Wed Mar 22, 2006 9:02 am

Another option.
User transformer to Split the second column into three.

Code: Select all

In.Column[1,INDEX(In.Column,'|',1)-1]

for the first link

Code: Select all

In.Column[INDEX(In.Column,'|',2),(INDEX(In.Column,'|',2)-INDEX(In.Column,'|',1)-)]

for the second link

Code: Select all

In.Column[INDEX(In.Column,'|',3),(Len(In.Column)-INDEX(In.Column,'|',3))]

for the second link.

basic · Post by **basic** » Wed Mar 22, 2006 9:12 am

kumar_s wrote:Another option.
User transformer to Split the second column into three.
Code: Select all
In.Column[1,INDEX(In.Column,'|',1)-1] 
for the first link
Code: Select all
In.Column[INDEX(In.Column,'|',2),(INDEX(In.Column,'|',2)-INDEX(In.Column,'|',1)-)]
for the second link
Code: Select all
In.Column[INDEX(In.Column,'|',3),(Len(In.Column)-INDEX(In.Column,'|',3))]  
for the second link.

thanks for you help kumar_s, but in my job i must not to know the number of columns in my output file, i must to create a output file with a number of columns dynamically. That's why, i think the pivot is the best solution but i don't know what i put in the derivation...???

basic · Post by **basic** » Wed Mar 22, 2006 9:54 am

basic wrote:Hi all,

Here's the requirement that i'm facing:

I have an input file with two columns :

For example, the first column contains :
"a"

The second column contains :

"1col1|2col2|3col3".

Do you know how i can do in the "derivation" of my pivot to have in my output file :

"a;1col1"
"a;2col2"
"a;3col3" //correction

(use the "|" in the derivation???)

Thanks in advance,

cool

Ultramundane · Post by **Ultramundane** » Wed Mar 22, 2006 12:02 pm

I don't know how to do this with pivot stage.

However, you could:
Use a sequential file stage with a delimeter that does not occur in the file. Maybe \xF8 Also, set Quote to none. Store this in one column. Example col1 varchar without any length.

Send col1 down a link to an external filter. Run this code in the filter.

Code: Select all

## AUTHOR: Ryan T. Putnam

awk 'BEGIN {FS=";"; OFS=";";}
{
   START=0;
   LEN=index(substr($2, START), "|");
   START=START + LEN + 1;

   if (( LEN == 0 ))
   {
      print $1,$2;
   }
   else
   {
      print $1,substr($2, 1, LEN -1);
      while (LEN > 0)
      {
         LEN=index(substr($2, START), "|");
         if (( LEN > 0 ))
         {
            print $1,substr($2, START, LEN -1);
         }
         else
         {
             print $1,substr($2, START);
         }
         START=START + LEN;
      }
   }
}'

Send the filtered results down a link to the column import. Create an output link to the destination. In the column import stage, set the delimeter to ";" and Quote to none. Also, create the columns in the output that you need based upon the assignment names.

ThilSe · Post by **ThilSe** » Thu Mar 23, 2006 2:44 am

basic wrote: That's why, i think the pivot is the best solution but i don't know what i put in the derivation...???

Hi,

In the derivations write the column names seperated by comma.

if u have 4 columns (Key1,Col1,col2,col3) then
in the output create "Key1" and "CombinedColOp" fields.
In derivation of "CombinedColOp" write the columns names seperated by coma i.e. "Col1,col2,col3"

Regards
Senthil

djm · Post by **djm** » Thu Mar 23, 2006 7:04 pm

A much simpler awk script that that listed above would achieve what you need. Have a look at this thread. There is a reply that I contributed that is very close to what you are after. Tweak the awk script in that and you will have what you need. Suggestion: if you end up with an awk script with more than about 3 lines as a solution for this, you are probably not doing it very efficiently.

David

Ultramundane · Post by **Ultramundane** » Thu Mar 23, 2006 8:42 pm

I will say that this 3 liner below is much easier to read, but is much slower. It takes 1.5 seconds for a 70000 record test I ran vs. the longer version which takes only 1 second. The mulitple delimeters have my head spinning.

Code: Select all

## Author Ryan T. Putnam
awk 'BEGIN {FS=";"; OFS=";";}
{
  TO=split($2, ARRAY, "|");
  if ( TO == 0 ) { print $1 OFS; }
  for (i=1;i<=TO;i++) { print $1,ARRAY[i]; }
}'

I am not sure I coded this with 100% efficiency (as it runs slower than the previous version), but I did want to comply with the 3 line rule.

Ultramundane · Post by **Ultramundane** » Thu Mar 23, 2006 8:59 pm

DJM's solution he provided earilier is much faster. I find it more difficult to read, but it runs in .5 seconds. I think he was being modest by saying 3 lines because his solution will fit on one line very nicely.

DJM's solution modified by Ryan.

Code: Select all

awk 'BEGIN {FS=";"; OFS=";";} { gsub(/|/, "\n" $1 OFS); print $0; }'

djm · Post by **djm** » Thu Mar 23, 2006 11:31 pm

Ahh, but my programming layout style would have laid out your "core" code like so:

Code: Select all

{
  gsub(/|/, "\n" $1 OFS);
  print $0;
}

for maintainability reasons.

I don't know whether that makes mine 2 or 4 lines! By the way, nice implementation of a "leave this as an exercise for the reader" :D

David

DSXchange

Horizontal Pivot

Horizontal Pivot

Re: Horizontal Pivot

Re: Horizontal Pivot