How to realize opposite of Pivot?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
daniel0623
Charter Member
Charter Member
Posts: 34
Joined: Tue May 31, 2005 8:17 pm
Location: ShangHai,China

How to realize opposite of Pivot?

Post by daniel0623 »

Hi all,
For example , How to convert soure to target?
Source data:
key column
A 123
A 456
A 789
B 111
B 222
B 333
C 444
C 555
C 666

Target data:
key column1 column2 column3
A 123 456 789
B 111 222 333
C 444 555 666

Thanks!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Are there always three records per group, or is the number variable? This will affect the solution.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
daniel0623
Charter Member
Charter Member
Posts: 34
Joined: Tue May 31, 2005 8:17 pm
Location: ShangHai,China

Post by daniel0623 »

Yes , Target data always has same columns.
dennis77
Participant
Posts: 4
Joined: Tue Dec 16, 2003 9:13 pm

Post by dennis77 »

Hi,

There is no straight forward logic to your question i believe , but i can give a suggestion where u can achieve it through data stage.

1. Assuming you are having the data in a source table/File . Define Key column as key from the source stage.

2. Have another temporavary table with 2 columns . Say for example Key column , Values .
Have the value column as Varchar(1000) or Have the column length with the maximum length you would like to have.

3. Develop a job with Source table/File ,put a copy stage and a intermediate database stage (example DB2/UDB Stage or Oracle enterprise stage). Giving the temporavary table in the stage using the User-define Sql option for the stage.
In the Database write the following queries ,
insert into temp (key,values) value (Key,value) from the copy stage.
Update temp
Set value = value + ',' + column value from copystage
where key = keyvalue from copy stage .

Note 1:- Here i am concantenating the values into a single column ,separating by comma value and then store the data in the value column.

Note 2:- According to this logic first insert statement will work then if the key value is found in the table , it will start updating the value column based on the key value. All your values will be concatenated using the comma separator.


4. After updation/insertion from the temporavary table select the records using the link and pass it to a column import stage . In the column import stage you can split the value column in the temporary table in to multiple columns (as you said co1,col2,col3) stating the coma separator in the column import stage.

5. Have a target stage with insert option pass the key column and separated column from the column import stage output .

good luck , dennison , Wipro technologies,India
elavenil
Premium Member
Premium Member
Posts: 467
Joined: Thu Jan 31, 2002 10:20 pm
Location: Singapore

Post by elavenil »

If source will have only 3 records for a key column, then you can achieve this with the following method.

In the transformer, create n no of files (n=no of rows fro each key column), and the file metadata would be 4 columns (1st col is key and remainings are the data.) First file, load 1st column and leave the remaining the two cols blank, 2nd file, populate key col and 2nd col and 3rd file, populate key and 3rd col. Merge all the three files and use aggregator to aggregate by key so that you would get the expected output.

Regards
Saravanan
lshort
Premium Member
Premium Member
Posts: 139
Joined: Tue Oct 29, 2002 11:40 am
Location: Toronto

Post by lshort »

elavenil,

Datastage reads a row at a tiime. With your solution how do you know when to write to second and third column?
Lance Short
"infinite diversity in infinite combinations"
***
"The absence of evidence is not evidence of absence."
sun rays
Charter Member
Charter Member
Posts: 57
Joined: Wed Jun 08, 2005 3:35 pm
Location: Denver, CO

Post by sun rays »

Saravanan's method could work, if you take the inrow-num and apply mod function with 3 as divisor and if get zero as a result then it goes to the third file, else apply mod with 2 as divisor and if get zero then it goes to second file, else it goes to first file. By this we could know which is first row, second, third, fourth and so on.
alanwms
Charter Member
Charter Member
Posts: 28
Joined: Wed Feb 26, 2003 2:51 pm
Location: Atlanta/UK

Post by alanwms »

I've done this before, but with variable number of rows per unique key column. First the key columns must be sorted. Next, an empty row must be added to the end of the input data. I use an after-stage ExecSH routine to append the extra line. This line will be used as your final break.

You'll need to use a set of stage variables to store the "current" and "previous" values of each column: Key, col1, col2, col3. When the Key column is different, you'll output a row with the output columns containing the "previous" values.

You'll need to assign the "previous" value stage variables with the contents of the "current" value stage variables before you assign the proper "current" column with the single incoming column data.

This explanation is probably hard to follow without code snippets, so I'll work on those snippets for a later post.

Alan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Sort (and hash partition) the data set based on the key column if it's not already thus sorted.

Load up three separate stage variables depending on MOD(key,3) value. Use these values in the three separate non-key columns in the output of a Transformer stage.

Run the result through an Aggregator stage grouping on Key and using Last as the aggregate function.

Or constrain the Transformer stage output to require that all three columns are not empty. But this requires a technique to reset the stage variables to "" on a change of key value.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
lshort
Premium Member
Premium Member
Posts: 139
Joined: Tue Oct 29, 2002 11:40 am
Location: Toronto

Post by lshort »

Hmmm.
Never used this technique before. Definately worth trying. 8)
Lance Short
"infinite diversity in infinite combinations"
***
"The absence of evidence is not evidence of absence."
smitageorge
Participant
Posts: 37
Joined: Fri Sep 30, 2005 10:05 am
Location: va

Post by smitageorge »

alanwms wrote: This explanation is probably hard to follow without code snippets, so I'll work on those snippets for a later post.
Alan
Hi Alan, can you pass on the snippets,so that this group will get the benefit of this.

Thanks
smita
Post Reply