Is the output of the Aggregator stage sorted?
Posted: Thu Apr 05, 2012 3:48 pm
Hello,
Does anyone know whether, when sorted data is input to an Aggregator / Funnel / Join / Merge stage, the output data from these stages will also be sorted?
I have a very basic aggregator job with the following details -
Infile - 40 fields
Key - 1 field
Aggregation columns - 2 fields
Pass through - 37 fields.
To achieve this, I am doing these steps -
1) Sort the infile on "Key" and generate a "Key Change column"
2) Pass the sorted data through the Copy stage, which feeds ---
a) Aggregator stage with key_clm and 2 rollup fields
b) Filter stage with key_clm and 37 pass through fields
3a) Aggregate the 2 fields.
3b) Filter records with "Key Change column = 1", thus getting distinct rows on key_clm
4a) Sort the output data from the Aggregator stage
4b) Sort the output data from the Filter stage
5) Join the data using the Key field in the Join stage.
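To make the flow concrete, here is a rough plain-Python sketch of the same logic. It is not DataStage code: the column names `key`, `amt1`, `amt2`, the function name `run_job`, and the use of sums as the rollup are assumptions made only for illustration.

```python
from itertools import groupby
from operator import itemgetter

_NO_KEY = object()  # sentinel so the first row always counts as a key change


def run_job(rows):
    """rows: list of dicts with 'key', 'amt1', 'amt2' plus pass-through fields.

    All column names here are hypothetical stand-ins for the real job's fields.
    """
    # Step 1: sort on the key and flag the first row of each key group.
    ordered = sorted(rows, key=itemgetter("key"))
    flagged = []
    prev = _NO_KEY
    for row in ordered:
        flagged.append({**row, "key_change": 1 if row["key"] != prev else 0})
        prev = row["key"]

    # Steps 2a/3a: aggregate the two rollup fields per key (sum is an assumption).
    totals = []
    for key, group in groupby(flagged, key=itemgetter("key")):
        group = list(group)
        totals.append({
            "key": key,
            "amt1_sum": sum(r["amt1"] for r in group),
            "amt2_sum": sum(r["amt2"] for r in group),
        })

    # Steps 2b/3b: keep one row per key, carrying only the pass-through fields.
    dropped = ("amt1", "amt2", "key_change")
    firsts = [
        {c: v for c, v in r.items() if c not in dropped}
        for r in flagged
        if r["key_change"] == 1
    ]

    # Step 4: the explicit re-sorts that the question is about.
    totals.sort(key=itemgetter("key"))
    firsts.sort(key=itemgetter("key"))

    # Step 5: join the two branches on the key.
    by_key = {r["key"]: r for r in firsts}
    return [{**by_key[t["key"]], **t} for t in totals]
```

In this Python sketch the two sorts in step 4 happen to change nothing, because `groupby` and the list comprehension both keep the input order. That is a property of the sketch only and says nothing about what the DataStage stages guarantee.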
Is the sort in step (4) really needed, or will the Aggregator and Filter stages output the data already sorted, so that it does not need to be sorted again for the Join stage?
Please help me understand the outputs from the stages a little better.
Thanks,
Neha