AGGR inconsistency running PX 7.5.1

ds_is_fun · Post by **ds_is_fun** » Wed Apr 20, 2005 6:55 am

I am sorting by 36 columns and grouping on the same 36 columns, and partitioning on 3 of those sort columns (i.e. 3 columns are both sort and partition keys).
I am performing a sum calculation on 7 other columns.
I have not checked the key option on any columns in the column window.
Each time I run the job, the Aggregator stage outputs a different number of rows.
Please help. How can I get the same number of rows output every time I run?
I am using 8 CPUs.
The same problem gives me different values in the sum columns when I total them up in an Excel sheet. Of course, that is because a different number of rows is output each time.
Thanks.

ds_is_fun · Post by **ds_is_fun** » Wed Apr 20, 2005 7:00 am

I am using the Sort method in the AGGR stage.
Do I need to use a Sort stage before the AGGR stage and sort on all the columns that I am grouping by in the AGGR stage? Or is it permissible to sort, in the AGGR stage itself, on all the columns that I am also grouping by?
Thanks

bgs · Post by **bgs** » Wed Apr 20, 2005 1:59 pm

The output of the Aggregator depends on the partitioning you are using and the number of nodes. You can try running the Aggregator in sequential mode, or, if you run it in parallel, use hash partitioning.
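To illustrate why the row count can depend on partitioning, here is a minimal Python simulation. It is not DataStage code: the functions `run_aggregator`, `round_robin`, and `hash_on_key`, the sample rows, and the single-column key are all made up for the sketch. It assumes each node aggregates only the rows it receives, so a group that is spread across nodes produces one output row per node.

```python
import zlib
from collections import defaultdict

def aggregate_partition(rows):
    """Group (key, amount) rows within one partition and sum the amounts."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)

def run_aggregator(rows, num_nodes, partitioner):
    """Split rows across nodes, aggregate each node independently, and
    return the concatenated output (one row per group per node)."""
    partitions = [[] for _ in range(num_nodes)]
    for index, row in enumerate(rows):
        partitions[partitioner(index, row, num_nodes)].append(row)
    output = []
    for part in partitions:
        output.extend(aggregate_partition(part).items())
    return output

def round_robin(index, row, num_nodes):
    # Hypothetical partitioner that ignores the grouping key entirely.
    return index % num_nodes

def hash_on_key(index, row, num_nodes):
    # Hypothetical partitioner: same key always lands on the same node.
    return zlib.crc32(row[0].encode()) % num_nodes

# 40 sample rows over 5 groups, aggregated on 8 simulated nodes.
rows = [(f"cust{i % 5}", 10) for i in range(40)]
rr_out = run_aggregator(rows, 8, round_robin)
hash_out = run_aggregator(rows, 8, hash_on_key)

print(len(rr_out), len(hash_out))
```

In this sketch, hashing on the grouping key yields exactly one output row per group, while the key-agnostic partitioner splits every group across nodes and yields many more rows. The simulation only demonstrates the row-count effect; it does not claim to reproduce how the PX engine actually distributes data.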

ds_is_fun · Post by **ds_is_fun** » Wed Apr 20, 2005 2:44 pm

DS : Use hash mode for a relatively small number of groups; generally, fewer than about 1000 groups per megabyte of memory. Sort mode requires the input data set to have been partition sorted with all of the grouping keys specified as hashing and sorting keys.

I'm currently using Sort mode since I have a large number of rows, and hence would have a larger number of rows in each group.
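The sort-mode requirement quoted above can be pictured with a small Python sketch. This is an assumption about the general technique, not DataStage internals: it treats a sort-mode aggregator as something that sums over consecutive runs of equal keys, so the function `sort_mode_aggregate` and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def sort_mode_aggregate(rows):
    """Sum amounts over consecutive runs of equal keys, the way a
    streaming aggregator would. Correct only for key-sorted input."""
    return [(key, sum(amount for _, amount in run))
            for key, run in groupby(rows, key=itemgetter(0))]

rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

# Unsorted input: the same key starts a new group each time it reappears.
unsorted_out = sort_mode_aggregate(rows)

# Input sorted on the grouping key: one output row per group.
sorted_out = sort_mode_aggregate(sorted(rows, key=itemgetter(0)))

print(unsorted_out)
print(sorted_out)
```

If the input is not sorted on every grouping key, a key that reappears later in the stream starts a new group, so the output has extra rows with partial sums. The grand total is unchanged, but the per-group sums are wrong.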