Parallel aggregator
Moderators: chulett, rschirm, roy
Parallel aggregator
Is pipe-lining possible when aggregator stage is used. For instance, I have a job using aggregator, that sums up colD by grouping 3 input columns, ColA, ColB, ColC, so before aggregator start spitting the rows out, it waits for all the input rows to be read.
Is there a way to make aggregator stage producing output even before all the input data is read?
Is there a way to make aggregator stage producing output even before all the input data is read?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
What Craig said.
The reason this "pipelines" the data is that the Aggregator stage does not have to build a table in memory to accumulate all the results. It can processing a single grouping key at a time, processing all the rows for that group (since the data are sorted by grouping key(s)), and immediately transfer that group result to the output line and free the memory that was being used to hold the results.
Thus Sort mode make minimal use of memory compared to Hash mode. Of course, some of that memory will be consumed sorting the data upstream of the Aggregator stage.
The reason this "pipelines" the data is that the Aggregator stage does not have to build a table in memory to accumulate all the results. It can processing a single grouping key at a time, processing all the rows for that group (since the data are sorted by grouping key(s)), and immediately transfer that group result to the output line and free the memory that was being used to hold the results.
Thus Sort mode make minimal use of memory compared to Hash mode. Of course, some of that memory will be consumed sorting the data upstream of the Aggregator stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I'm not sure there's a "recommended" so much as a default one but others can address that. The partitioning types need to be something you become very familiar with unless you only ever run your jobs on a single node. The moment you open that up to more than one node, how the data is partitioned can make or break things. Simplest case is it runs longer that it should. Worst case, your output data is wrong or incomplete. Or the job goes boom.
Perhaps this will help.
Perhaps this will help.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Same is recommended iff the upstream stage:
- is executing in parallel mode
is executing on the same nodes
is partitioned using the correct algorithm
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Yes the job is executing on 6 nodes with all the stages running on the same nodes. In this eg, the upstream stage is a transformer stage. The input stage is a Oracle stage where partitioned read is enabled (with default rowid range).
The group is performed on Year(kind of redundant because we only process current year and target inserts into current year partition only), month, id1, id2...Id10. That forms the grain of the target table. So Year is the first grouping key and sorting is in the same as grouping key order.
I did get a significant performance improvement with 'same' partition, aggregating 26 M rows in less than 4 min. But when I specified 'Hash' or 'Modulus', I do not see the pipe-lining and the job takes long time to complete
The group is performed on Year(kind of redundant because we only process current year and target inserts into current year partition only), month, id1, id2...Id10. That forms the grain of the target table. So Year is the first grouping key and sorting is in the same as grouping key order.
I did get a significant performance improvement with 'same' partition, aggregating 26 M rows in less than 4 min. But when I specified 'Hash' or 'Modulus', I do not see the pipe-lining and the job takes long time to complete
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
That would depend on what partitioning is used upstream of the Transformer stage. Your answer suggests an unfamiliarity with what partitioning does and achieves. Is there any explicit partitioning upstream in your job design?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Did you click on the included link? Partitioning is core, key knowledge required for success using the tool. IMHO.chulett wrote:Perhaps this will help.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers