Aggregator Stage Warning.
-
- Premium Member
- Posts: 614
- Joined: Fri Feb 06, 2004 3:59 pm
Aggregator Stage Warning.
Hi All,
I am getting warnings of the following type from the Aggregator stage:
Aggregator_33,3: Hash table has grown to 16384 entries.
What does that mean?
Any input is greatly appreciated.
Thank you.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
It means that your hash table has grown to 16384 entries. That is, 16384 different combinations of grouping column values when HASH method is being used. It's meant to alert you to the fact that you're consuming a certain amount of memory, and might therefore consider switching to SORT method. Nothing is broken, but if you end up having very many more combinations of grouping column values, then things might start breaking.
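To illustrate why the number of distinct grouping combinations drives memory use, here is a minimal Python sketch of hash-style aggregation. It is not DataStage's implementation; the function name `hash_aggregate` and the sample columns are hypothetical, chosen only to show that one table entry is held per distinct combination of grouping values until the input is exhausted.

```python
from collections import defaultdict

def hash_aggregate(rows, key_columns, value_column):
    """Sum value_column per group, holding every group in memory at once."""
    totals = defaultdict(int)
    for row in rows:
        # One table entry exists per distinct combination of grouping values.
        key = tuple(row[c] for c in key_columns)
        totals[key] += row[value_column]
    return dict(totals)

# Hypothetical sample data: four rows, three distinct (region, product) groups.
rows = [
    {"region": "east", "product": "a", "amount": 10},
    {"region": "west", "product": "a", "amount": 5},
    {"region": "east", "product": "a", "amount": 7},
    {"region": "east", "product": "b", "amount": 1},
]
totals = hash_aggregate(rows, ["region", "product"], "amount")
print(len(totals))  # 3 entries, one per distinct grouping combination
```

The table only ever grows while rows are being read, which is why a job with very many distinct groups keeps consuming memory under this approach.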
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Re: Aggregator Stage Warning.
This has been discussed many times in this forum.
Please use the search option.
pandeeswaran
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Depends what you mean by "adverse effects" I guess.
You should get identical results using either method. With sorted data you'll get them faster through the Aggregator stage, but much of that gain may be taken up sorting the data.
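As a rough illustration of that point, the Python sketch below aggregates the same rows both ways and compares the results. It is only an analogy, not DataStage code: `sort_aggregate`, `hash_aggregate` and the sample columns are hypothetical names. The sorted version needs to hold only the current group, but the input has to be sorted first, and that sort has its own cost.

```python
from itertools import groupby

def group_key(row, key_columns):
    """Build the grouping key for one row."""
    return tuple(row[c] for c in key_columns)

def sort_aggregate(sorted_rows, key_columns, value_column):
    """Yield (key, total) per group; input must already be sorted on key_columns."""
    for key, group in groupby(sorted_rows, key=lambda r: group_key(r, key_columns)):
        # Only the rows of the current group are consumed before emitting a result.
        yield key, sum(row[value_column] for row in group)

def hash_aggregate(rows, key_columns, value_column):
    """Sum per group using a table that holds every group until the end."""
    totals = {}
    for row in rows:
        key = group_key(row, key_columns)
        totals[key] = totals.get(key, 0) + row[value_column]
    return totals

# Hypothetical unsorted sample data.
rows = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]
keys = ["region"]
presorted = sorted(rows, key=lambda r: group_key(r, keys))  # the extra sorting cost
sorted_result = dict(sort_aggregate(presorted, keys, "amount"))
hash_result = hash_aggregate(rows, keys, "amount")
print(sorted_result == hash_result)  # True
```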
Thanks Ray!!
Since we sometimes get the "Hash table has grown to 16384 entries" warning, why would we need to use the hash method at all?
As I understand it, we could always use the sort method, provided the records are sorted before they reach the Aggregator stage.
In which situations is the hash method the best practice?
pandeeswaran
As stated in the Parallel Job Developer Guide, the Aggregator's hash method is typically used for a relatively small number of groups (distinct key values). The more groups in your data, the more memory the operator uses as it builds its hash table of group values and calculations.
With the hash method you do not need to sort your data before it enters the Aggregator (although you should still partition it). For a large number of groups, pre-sorting the data and using the Sort method in the Aggregator is likely to be more efficient.
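To show why partitioning on the grouping key still matters, here is a small Python sketch. It assumes a simple CRC-based hash partitioner, which is only a stand-in and not how DataStage partitions data; `partition_by_key` and the sample columns are hypothetical. The point is that every row of a given group lands in the same partition, so each partition can aggregate its groups independently.

```python
import zlib

def partition_by_key(rows, key_columns, num_partitions):
    """Assign each row to a partition chosen from a hash of its grouping key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        key = "|".join(str(row[c]) for c in key_columns)
        # Equal keys always produce the same index, so a group is never split.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions

# Hypothetical sample data with three distinct customers.
rows = [
    {"customer": "c1", "amount": 10},
    {"customer": "c2", "amount": 5},
    {"customer": "c1", "amount": 7},
    {"customer": "c3", "amount": 2},
]
partitions = partition_by_key(rows, ["customer"], num_partitions=2)

# Count how many partitions each customer appears in; it should always be one.
partitions_per_key = {}
for part in partitions:
    for customer in {row["customer"] for row in part}:
        partitions_per_key[customer] = partitions_per_key.get(customer, 0) + 1
print(all(count == 1 for count in partitions_per_key.values()))  # True
```

If rows for one group were spread across partitions instead, each partition would emit its own partial result for that group.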
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
-
- Premium Member
- Posts: 614
- Joined: Fri Feb 06, 2004 3:59 pm
Hi All,
Even after sorting and partitioning the data on the key field (the grouping field), Method = Sort in the Aggregator stage causes the warning below, and the input file is big.
I am not sure what I am missing.
Aggregator_33: When checking operator: User inserted sort "Sort_22" does not fulfill the sort requirements of the downstream operator "APT_SortedGroup2Operator in Aggregator_33"
Can somebody clarify?
I appreciate your input.
-
- Premium Member
- Posts: 497
- Joined: Sun Dec 17, 2006 11:52 pm
- Location: Kolkata
- Contact: