Partition, Sort, Group by in Aggregator Stage
Posted: Wed Apr 20, 2005 3:01 pm
I'm confused about the way AGGR works. Let's assume we want to sort, group by, and sum in PX with 2 CPUs.
Say, we have a file
Ename Empno Deptno Sal Mgr
John 100 10 1000 102
Smith 101 20 2000 102
Eric 102 10 3000 102
Raj 103 10 2000 101
Tom 104 30 2000 101
Dan 105 30 1000 101
Drew 106 20 2000 102
We partition by "Mgr" and choose Sort mode, treating the file above as a stand-in for something large.
We group by "Deptno".
We also need to calculate the Sum of Sal.
So, based on the above:
All rows that have Mgr = 101 go to CPU1 for processing.
Rows that have Mgr = 102 go to CPU2 for processing.
Is the sum calculation done at this point? Or are the data sets from each processor
brought back together so that the group by and sum run across all of them?
Please explain.
Thanks
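To make the scenario concrete, here is a rough sketch in plain Python. It is not DataStage code, only a simulation of the rows above. It assumes that each partition is grouped and summed on its own, with nothing merged across partitions afterwards. The helper names (partition_by, sum_sal_by_deptno) are made up for illustration, and each key value gets its own partition for simplicity, whereas a real partitioner decides which node a key lands on.

```python
from collections import defaultdict

# Sample rows from the post: (Ename, Empno, Deptno, Sal, Mgr)
ROWS = [
    ("John", 100, 10, 1000, 102),
    ("Smith", 101, 20, 2000, 102),
    ("Eric", 102, 10, 3000, 102),
    ("Raj", 103, 10, 2000, 101),
    ("Tom", 104, 30, 2000, 101),
    ("Dan", 105, 30, 1000, 101),
    ("Drew", 106, 20, 2000, 102),
]

DEPTNO, MGR = 2, 4  # column positions within a row


def partition_by(rows, key_index):
    """Hypothetical partitioner: one partition per distinct key value."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key_index]].append(row)
    return dict(parts)


def sum_sal_by_deptno(rows):
    """Group the given rows by Deptno and sum Sal."""
    totals = defaultdict(int)
    for _ename, _empno, deptno, sal, _mgr in rows:
        totals[deptno] += sal
    return dict(sorted(totals.items()))


# Case 1: partition on Mgr, then aggregate each partition independently.
by_mgr = {
    mgr: sum_sal_by_deptno(part)
    for mgr, part in partition_by(ROWS, MGR).items()
}

# Case 2: partition on the grouping key (Deptno) instead.
by_deptno = {
    dept: sum_sal_by_deptno(part)
    for dept, part in partition_by(ROWS, DEPTNO).items()
}

# Reference: a single aggregation over all rows.
global_totals = sum_sal_by_deptno(ROWS)

print("partitioned on Mgr:   ", by_mgr)
print("partitioned on Deptno:", by_deptno)
print("single aggregation:   ", global_totals)
```

Under that assumption, partitioning on Mgr produces two rows for Deptno 10 (2000 from the Mgr = 101 partition and 4000 from the Mgr = 102 partition), so those are partial sums rather than the department total of 6000. Partitioning on Deptno keeps every row of a group in the same partition, and the per-partition sums then match the single aggregation.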