AGGREGATOR: Performance Suggestion

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

vijayrc
Participant
Posts: 197
Joined: Sun Apr 02, 2006 10:31 am
Location: NJ

Post by vijayrc »

Hi,

I have a file with 104 columns totaling close to 1100 bytes.
The job has numerous links of Sort, Aggregator, and Column Generator stages that
compute different totals (50 in all), each based on a combination of fields
(usually 3 to 10 fields). A Funnel then combines all these links into one
final output.

Today I pass all 104 columns through each Sort, sorting only on the columns
that the succeeding Aggregator groups by. The remaining columns still flow
through the Sort into the Aggregator, which uses only its group-by and
aggregate columns.

Is it worthwhile to change my jobs so that the input to each Sort is limited to
the fields the Aggregator needs?

I have 50 such links for different totals, so before making the change I thought
I would get some opinions on this.

Thanks again
Vijay
mansoor_nb
Participant
Posts: 48
Joined: Wed Jun 01, 2005 7:10 am

Post by mansoor_nb »

Yes, it is better to propagate to the Sort stage only those columns that the Aggregator requires. The Sort then has to handle much narrower records.
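To illustrate the idea outside DataStage, here is a minimal Python sketch. It is not DataStage code: the column names, the sample rows, and the helpers `project` and `sort_and_total` are all made up for illustration. It only shows that dropping the unused columns before the sort leaves the totals unchanged while each record carries 4 fields instead of 104.

```python
from itertools import groupby
from operator import itemgetter


def project(rows, keep):
    """Keep only the named columns of each row (drop everything else)."""
    return [{col: row[col] for col in keep} for row in rows]


def sort_and_total(rows, group_cols, amount_col):
    """Sort on the group-by columns, then sum amount_col within each group."""
    key = itemgetter(*group_cols)
    totals = {}
    for group_key, group in groupby(sorted(rows, key=key), key=key):
        totals[group_key] = sum(r[amount_col] for r in group)
    return totals


# Hypothetical input: 3 group-by columns, 1 amount column, 100 filler columns.
filler = {f"filler_{i}": "x" * 10 for i in range(100)}
rows = [
    {"region": "NJ", "product": "A", "year": 2006, "amount": 10, **filler},
    {"region": "NY", "product": "B", "year": 2006, "amount": 5, **filler},
    {"region": "NJ", "product": "A", "year": 2006, "amount": 7, **filler},
    {"region": "NY", "product": "B", "year": 2005, "amount": 1, **filler},
]

group_cols = ("region", "product", "year")

# Wide path: all 104 columns travel through the sort.
wide_totals = sort_and_total(rows, group_cols, "amount")

# Narrow path: keep only the 4 columns the aggregation uses, then sort.
slim_rows = project(rows, group_cols + ("amount",))
narrow_totals = sort_and_total(slim_rows, group_cols, "amount")

print(len(rows[0]), "columns per row in the wide path")
print(len(slim_rows[0]), "columns per row in the narrow path")
print("same totals:", wide_totals == narrow_totals)
```

In a real job the projection would be done in whatever stage feeds each Sort, and the actual gain depends on data volume and configuration, so it is worth measuring on one link before changing all 50.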