Aggregator: Multiple column aggregation into one

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
vijayrc
Participant
Posts: 197
Joined: Sun Apr 02, 2006 10:31 am
Location: NJ

Aggregator: Multiple column aggregation into one

Post by vijayrc »

Hi,

For the same group key in an aggregator, would it be possible to aggregate multiple columns in a record into one output column.

e.g Input
Groupkey1,Groupkey2,Groupkey3.....Amt1,Amt2,Amt3....
1111,2222,3333.....$10,$5,$3
1111,2222,3333.....$15,$3,$2
2222,3333,4444.....$1,$2,$3
2222,3333,4444.....$2,$3,$4

Ouput:
1111,2222,3333.....$38 [10+5+3+15+3+2]
2222,3333,4444.....$15 [1+2+3+2+3+$]

Is this possible ?
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Ofcourse its possible. First do a normal aggregation with the aggregator grouped on the keys. This will leave you with all unique combinations of the keys. Then you can add the columns in the transformer itself into just one column. There you go. Its done.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Though Aggregator, youc cannot Sum accross Columns. As suggested use Transformer for that case.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

This one really needs Aggregator ---> Transformer ---> Aggregator (though the accumulations of the grouped values could be achieved in stage variables in a Transformer stage rather than in the first Aggregator stage).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Ray, i get the first aggregator which is to do normal aggregations based on the grouped keys.
I get the transformer to get the aggregations of the columns across. But whats the purpose of the last aggregator? :roll:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Well spotted, it's not necessary. I was solving a different problem.

The aggregator stage will generate:
Groupkey1,Groupkey2,Groupkey3.....Amt1,Amt2,Amt3....
1111,2222,3333.....$20,$8,$5
2222,3333,4444.....$3,$5,$7


The Transformer stage (adding the columns) will generate:
Groupkey1,Groupkey2,Groupkey3.....AmtTot
1111,2222,3333.....$38
2222,3333,4444.....$15

Make sure that the data are sorted and partitioned on GroupKey1, GroupKey2, GroupKey3 so that all members of each group are on the same processing node.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vijayrc
Participant
Posts: 197
Joined: Sun Apr 02, 2006 10:31 am
Location: NJ

Post by vijayrc »

ray.wurlod wrote:Well spotted, it's not necessary. I was solving a different problem.

The aggregator stage will generate:
Groupkey1,Groupkey2,Groupkey3.....Amt1,Amt2,Amt3....
1111,2222,3333.....$20,$8,$5
2222 ...
Thanks all for suggestions, [though some answers, I can't see in full]. But my question is Would it be possible with Single Aggregator + anything besides Transformer [costly ?] or with Aggregator back-to-back ?
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

As pointed out, Aggregator cannot aggregate accross columns. For that you need a transformer. The only other option that i can think of right now is to do it in the sql itself if your source is a database by writing your own sql.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You could sum across columns in the Extraction SQL thereby avoiding the need for a Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

You may find some or other post to deny that transformer cuases Performance degradation. And more over, otherthan adding up this in the source database stage, Transformer will be easieast way to accomplish it.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Tell them Kumar man. :twisted:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Post Reply