Aggregator stage


DS1
Charter Member
Posts: 29
Joined: Wed Mar 29, 2006 1:13 pm


Post by DS1 »

Hello everybody,

I am using the parallel Aggregator stage in 7.1.
There are around 30 columns passing through it, of which 17 are grouping columns and one column needs to be summed. The rest of the columns (numeric as well as char) need to be passed through as they are, taking the first or last value.

My problem is how to get these remaining columns to pass through unchanged. The Aggregator stage does not carry these columns without derivations for them. However, giving max/min as the derivation would not help either, as these columns are char.

Can anyone please help me with this problem?

Thanks
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

Sort the data going into your Aggregator stage and then define the derivations for the rest of the columns as first/last, based upon your requirement. Also, don't forget to specify in the Aggregator how the columns are coming in, i.e. the sort order of the input columns.

HTH
Kris

Where's the "Any" key?-Homer Simpson
vinaymanchinila
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

Kris,
I didn't get it either. He is asking what to do with the rest of the columns, i.e. how to propagate them to the output link without using derivations.
Thanks,
Vinay
DS1
Charter Member
Posts: 29
Joined: Wed Mar 29, 2006 1:13 pm

Post by DS1 »

Yes, as Vinay mentioned, my problem is different.

I have the columns sorted, and it works well if I take only the grouping columns and the column to be summed; the aggregation is just fine. The problem is how to propagate the columns that are not part of the grouping key and need no aggregation performed on them.
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

When you group the data by seventeen columns, you get one output row per distinct combination of those seventeen columns. One of the columns is summed, so its derivation is SUM; that part is fine. For each of the remaining columns you are going to get a single value per group (because you are grouping). You have to decide what this value should be: the first or the last based upon a date, or the Max or Min, or something like that. That is where the sorting comes into the picture. If I still don't make sense, maybe an example input dataset from the OP would help me explain it better.
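To make the idea concrete, here is a minimal sketch in plain Python (not DataStage code) of grouping on key columns, summing one column, and keeping the last value seen for every other column. The function name `aggregate_last` and the column names are hypothetical and chosen only for illustration; the input is sorted first so that "last" has a defined meaning.

```python
def aggregate_last(rows, key_cols, sum_col):
    """Group rows on key_cols, sum sum_col, keep the last value of all other columns."""
    groups = {}  # dicts keep insertion order, so groups come out in first-seen order
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in groups:
            groups[key] = dict(row)          # first row of the group
        else:
            out = groups[key]
            total = out[sum_col] + row[sum_col]
            out.update(row)                  # later rows overwrite pass-through columns
            out[sum_col] = total             # restore the running sum
    return list(groups.values())


# Hypothetical sample data: two key columns, one summed column, one char column.
rows = [
    {"region": "EU", "product": "A", "updated": 2, "amount": 5, "note": "second"},
    {"region": "US", "product": "A", "updated": 1, "amount": 7, "note": "only"},
    {"region": "EU", "product": "A", "updated": 1, "amount": 10, "note": "first"},
]

# Sort on the keys plus the column that decides which row counts as "last".
rows.sort(key=lambda r: (r["region"], r["product"], r["updated"]))
result = aggregate_last(rows, ["region", "product"], "amount")
```

With this sample data, the EU/A group ends up with amount 15 and the note from the row with the highest `updated` value. Swapping the sort direction would give the "first" value instead.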
Kris

DS1
Charter Member
Posts: 29
Joined: Wed Mar 29, 2006 1:13 pm

Post by DS1 »

Yes, what you say makes full sense, and yes, each of the other columns has to be given a single value, let's say the last value. Now how do I do that in the Aggregator stage?
Using the max function in the Aggregator stage does not help, as they are char fields. I am not able to see any option like last/first in the Aggregator stage, or any other option that will help me propagate these columns.

Can you please help me with how to propagate these columns (yes, with one value, say, the last)?

Thanks
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

OK, I get it. But did you try using the Max and Min functions in the stage even though your datatype is CHAR? Try it once and see what happens.
Kris

DS1
Charter Member
Posts: 29
Joined: Wed Mar 29, 2006 1:13 pm

Post by DS1 »

I have tried using max and min for the rest of the columns; the job aborts.
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

DS1 wrote: I have tried using max and min for the rest of the columns; the job aborts.
What message does it give when it aborts?
Kris

nick.bond
Charter Member
Posts: 230
Joined: Thu Jan 15, 2004 12:00 pm
Location: London

Post by nick.bond »

I think there is a problem with the EE Aggregator stage when trying to pass char fields through. I think it has been raised with Ascential and they will fix it in the next release. We have used either Transformer stages with bespoke aggregation logic, or you can put a Server Aggregator into a shared container and use that shared container in your EE job. I would only use the latter if it is not the main flow of data being aggregated, because the performance hit will be very large.
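As a rough illustration of what bespoke aggregation logic on sorted input can look like, here is a sketch in plain Python rather than Transformer derivations. It is an assumption about the general technique (detect a change in the grouping key, emit the finished group, start a new one), not the actual logic used in our jobs; the names `aggregate_sorted`, `cust`, `city` and `amount` are hypothetical.

```python
def aggregate_sorted(rows, key_cols, sum_col):
    """Stream over rows pre-sorted on key_cols and yield one row per group.

    The summed column accumulates; every other column keeps its last value.
    """
    current_key = None
    current = None
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key != current_key:
            if current is not None:
                yield current                # key changed: emit the finished group
            current_key = key
            current = dict(row)
        else:
            total = current[sum_col] + row[sum_col]
            current = dict(row)              # take the latest pass-through values
            current[sum_col] = total
    if current is not None:
        yield current                        # emit the final group


# Hypothetical input, already sorted on the grouping key "cust".
sorted_rows = [
    {"cust": "C1", "city": "Leeds", "amount": 3},
    {"cust": "C1", "city": "London", "amount": 4},
    {"cust": "C2", "city": "Paris", "amount": 9},
]
result = list(aggregate_sorted(sorted_rows, ["cust"], "amount"))
```

Because only one group is held in memory at a time, this approach depends on the input being sorted (and, in a parallel job, partitioned) on the grouping key.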
Regards,

Nick.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

The same has been discussed here. But I believe there is a patch already released by IBM for passing strings through the Aggregator stage. Search this forum for more information.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'