Aggregator issue

Posted: Thu Jun 14, 2007 12:26 pm
by har
Folks,
I'm processing a million records through a Sort stage and an Aggregator stage.
I sorted on 4 keys in ascending order in the Sort stage, then grouped by 4 keys and hash partitioned on 4 keys in the Aggregator stage.
I'm using the sort method in the Aggregator.
When I run the job 4 times with the same input dataset, I get 4 different record counts out of the Aggregator.
Any ideas?

Re: Aggregator issue

Posted: Thu Jun 14, 2007 2:09 pm
by gateleys
Hmmm! Why don't you try it with, say, 1,000 input rows and see if that leads you anywhere?

Or, is it that you are not providing us with other job details?

gateleys

Posted: Thu Jun 14, 2007 2:31 pm
by thebird
Make sure that the sorting keys, the grouping keys, and the hash partitioning keys are all in the same order.

Do you get any warnings in the log?
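
To illustrate why the key alignment matters, here is a toy Python model, not DataStage itself. It assumes a sort-method aggregator emits one output row per run of consecutive equal keys within each partition; the function names, the 5-partition setup, and the sample data are all made up for the illustration.

```python
from itertools import groupby

def sort_mode_aggregate(rows):
    # One (key, count) per run of consecutive equal rows, which is the
    # assumed behaviour of a sort-method aggregator in this toy model.
    return [(k, sum(1 for _ in grp)) for k, grp in groupby(rows)]

def run_partitioned(rows, n_parts, part_fn):
    # Distribute rows to partitions, sort each partition on its own,
    # aggregate each partition, then collect all output rows.
    parts = [[] for _ in range(n_parts)]
    for i, row in enumerate(rows):
        parts[part_fn(i, row) % n_parts].append(row)
    out = []
    for p in parts:
        p.sort()
        out.extend(sort_mode_aggregate(p))
    return out

# 12 distinct (a, b) keys, each appearing twice: 24 input rows.
rows = [(a, b) for a in range(3) for b in range(4)] * 2

# Partitioned on the grouping key: equal keys always share a partition.
aligned = run_partitioned(rows, 5, lambda i, row: hash(row))

# Partitioned without regard to the key (round-robin by row position).
misaligned = run_partitioned(rows, 5, lambda i, row: i)

print(len(aligned), len(misaligned))
```

In this model the key-aligned run produces one output row per distinct key, while the round-robin run splits groups across partitions and produces more output rows for the same input. If the split depends on anything that changes between runs, the output count would change too.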

Re: Aggregator issue

Posted: Thu Jun 14, 2007 2:35 pm
by gateleys
Did you key in any parameter values during each run?

Posted: Fri Jun 15, 2007 6:12 am
by har
The sorting keys and partitioning keys are in the same order.
For example, if I run the job for only one specific key, I get the exact records and the counts match. If I run it for a month (a million records), I see a difference.
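
One way to narrow this down on the full month of data is to check whether any group key shows up more than once in the Aggregator output, which would suggest a group is being split. Below is a minimal Python sketch of that check, assuming the output has been exported as rows whose first 4 columns are the grouping keys; the function name, column layout, and sample rows are hypothetical.

```python
from collections import Counter

def duplicated_group_keys(output_rows, n_keys=4):
    # Count how often each group key (the first n_keys columns) appears
    # in the aggregator output; a correct run should have each key once.
    counts = Counter(tuple(row[:n_keys]) for row in output_rows)
    return {k: c for k, c in counts.items() if c > 1}

# Hypothetical output rows: 4 key columns followed by a count column.
sample = [
    ("2007", "06", "A", "x", 10),
    ("2007", "06", "A", "y", 7),
    ("2007", "06", "A", "x", 3),
]

dupes = duplicated_group_keys(sample)
print(dupes)
```

Any key reported here could then be run on its own and compared with the full run.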