AGGR inconsistency running PX 7.5.1

ds_is_fun
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

I am sorting on 36 columns, grouping on the same 36 columns, and partitioning on 3 of those sort columns (i.e. 3 columns are both sort and partition keys).
I am performing a sum calculation on 7 other columns.
I have not checked the Key box for any column in the Columns window.
Each time I run the job, the Aggregator stage outputs a different number of rows.
Please help. How can I get the same number of rows output every time I run?
I am using 8 CPUs.
The same problem gives me different values in the sum columns when I total them up in an Excel sheet. Of course, that is because a different number of rows is output each time.
Thanks.
ds_is_fun
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

I am using the Sort method in the AGGR stage.
Do I need to use a Sort stage before the AGGR stage and sort on all the columns that I am grouping by in the AGGR stage? Or is it permissible to sort, in the AGGR stage itself, on all the columns that I am also grouping by?
Thanks
bgs
Participant
Posts: 22
Joined: Sat Feb 05, 2005 9:43 pm

Post by bgs »

The output of the Aggregator depends on the partitioning you are using and the number of nodes. You can try running the Aggregator in sequential mode, or, if you run it in parallel, use hash partitioning.
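
To illustrate why the partitioning matters, here is a rough Python sketch. It is not DataStage code; the column names (`region`, `product`, `amount`), the helper functions and the 4-partition setup are all made up for the example. It aggregates each partition independently, the way a parallel aggregator would, and compares hash partitioning on a subset of the grouping keys against round-robin partitioning.

```python
from collections import defaultdict
import zlib

def aggregate(rows, key_cols, sum_col):
    """Group rows on key_cols and sum sum_col (one partition, or sequential mode)."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[c] for c in key_cols)] += row[sum_col]
    return sorted(totals.items())

def hash_partition(rows, part_cols, n_parts):
    """Send every row with the same part_cols values to the same partition."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        key = "|".join(str(row[c]) for c in part_cols).encode()
        parts[zlib.crc32(key) % n_parts].append(row)
    return parts

def round_robin_partition(rows, n_parts):
    """Deal rows out in turn, ignoring their key values."""
    parts = [[] for _ in range(n_parts)]
    for i, row in enumerate(rows):
        parts[i % n_parts].append(row)
    return parts

def parallel_aggregate(parts, key_cols, sum_col):
    """Aggregate each partition on its own and concatenate the results."""
    out = []
    for part in parts:
        out.extend(aggregate(part, key_cols, sum_col))
    return sorted(out)

# Hypothetical sample data: 4 distinct (region, product) groups.
rows = [
    {"region": "E", "product": "A", "amount": 10},
    {"region": "E", "product": "A", "amount": 5},
    {"region": "E", "product": "B", "amount": 7},
    {"region": "W", "product": "A", "amount": 3},
    {"region": "W", "product": "A", "amount": 4},
    {"region": "W", "product": "B", "amount": 1},
]
keys = ["region", "product"]

sequential = aggregate(rows, keys, "amount")
# Partition on a subset of the grouping keys: each group stays in one partition.
hashed = parallel_aggregate(hash_partition(rows, ["region"], 4), keys, "amount")
# Partition without regard to keys: a group can be split across partitions.
round_robin = parallel_aggregate(round_robin_partition(rows, 4), keys, "amount")

print(len(sequential), len(hashed), len(round_robin))  # 4 4 6
```

In this toy case the hash-partitioned run matches the sequential result, while the round-robin run emits extra rows because the same group is summed separately in more than one partition. If the partitioning upstream of the Aggregator is not key-based, the split can also change from run to run, which would be consistent with a varying row count.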
ds_is_fun
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

DS : Use hash mode for a relatively small number of groups; generally, fewer than about 1000 groups per megabyte of memory. Sort mode requires the input data set to have been partition sorted with all of the grouping keys specified as hashing and sorting keys.

I'm currently using Sort mode since I have a large number of rows, and hence would have a larger number of rows in each group.
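
As a side note on the Sort mode requirement quoted above, here is a small Python sketch (not DataStage code; the data and the `sort_mode_aggregate` helper are invented for illustration). It assumes a sort-style aggregator only compares each row's key with the previous row's key and closes the group when the key changes, which is how `itertools.groupby` behaves.

```python
from itertools import groupby

def sort_mode_aggregate(rows, key_fn, value_fn):
    """Sum values over runs of consecutive rows that share the same key.

    Like a streaming sort-based aggregator, this never looks back: if the
    same key shows up again later, it starts a new output row.
    """
    return [(k, sum(value_fn(r) for r in grp)) for k, grp in groupby(rows, key=key_fn)]

# Hypothetical input: (group_key, amount) pairs arriving in unsorted order.
rows = [("A", 1), ("B", 2), ("A", 3), ("B", 4)]

key = lambda r: r[0]
value = lambda r: r[1]

unsorted_out = sort_mode_aggregate(rows, key, value)
sorted_out = sort_mode_aggregate(sorted(rows, key=key), key, value)

print(unsorted_out)  # [('A', 1), ('B', 2), ('A', 3), ('B', 4)]
print(sorted_out)    # [('A', 4), ('B', 6)]
```

With unsorted input the same key yields several output rows, so the row count depends on the order in which rows arrive. Sorting on all of the grouping keys first gives one row per group.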