Hi All,
Could someone help with the design of the following parallel job?
I have the following columns:
Input
------
A B C D
1 878 001 004
1 999 002 003
2 789 005 004
2 996 003 007
My output should be:
Group by A, with Sum(B), First(C) and First(D)
Output
--------
A B C D
1 1877 001 003
2 1785 003 004
Could someone help me achieve this logic in a parallel job? We have the First function in Server jobs; what is its equivalent in Parallel jobs?
Best Regards
Cherry
Aggregation
Re: Aggregation
Have you tried using the "Aggregator Stage"?
You can hash partition and sort the input of the Aggregator stage on field "A", and then specify the following properties in the stage:
Grouping key = A
Aggregation type = calculation
Column for calculation = B
Sum output column = B
Column for calculation = C
Sum output column = C
Column for calculation = D
Sum output column = D
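Why the input should be hashed on the grouping key can be illustrated outside DataStage. The following is a minimal plain-Python sketch, not DataStage code: the function names `hash_partition` and `sum_b_per_group` are hypothetical, and two partitions are assumed. Because every row with the same value of A lands in the same partition, each partition can aggregate on its own and no group is split across partitions.

```python
from collections import defaultdict

# Sample rows from the question: (A, B, C, D)
rows = [
    (1, 878, "001", "004"),
    (1, 999, "002", "003"),
    (2, 789, "005", "004"),
    (2, 996, "003", "007"),
]

def hash_partition(rows, key_index, num_partitions):
    """Send each row to a partition chosen by hashing its key column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key_index]) % num_partitions].append(row)
    return partitions

def sum_b_per_group(partition):
    """Sum column B per value of A within a single partition."""
    totals = defaultdict(int)
    for a, b, _c, _d in partition:
        totals[a] += b
    return dict(totals)

# Assumed partition count of 2, purely for illustration.
partitions = hash_partition(rows, key_index=0, num_partitions=2)

merged = {}
for part in partitions:
    partial = sum_b_per_group(part)
    # No key appears in more than one partition, so results never need combining.
    assert not (merged.keys() & partial.keys())
    merged.update(partial)

print(merged)
```

If the rows were distributed without regard to A, the same key could show up in several partitions and each would produce its own partial total.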
Re: Aggregation
Sorry for the copy/paste error ...
For C and D, you should use the following function instead:
Column for calculation = C
Minimum Value output column = C
Column for calculation = D
Minimum Value output column = D
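To check this against the sample data in the question, here is a small plain-Python sketch (not DataStage code; the function name `aggregate` and the result field names are made up for illustration). It computes the sum of B together with both the minimum and the first-encountered value of C and D per group, assuming the rows arrive in the order listed in the question.

```python
# Sample rows from the question: (A, B, C, D), in the order they were listed.
rows = [
    (1, 878, "001", "004"),
    (1, 999, "002", "003"),
    (2, 789, "005", "004"),
    (2, 996, "003", "007"),
]

def aggregate(rows):
    """Group by A; track Sum(B), Min(C), Min(D), and first-seen C and D."""
    out = {}
    for a, b, c, d in rows:
        if a not in out:
            # First row of the group seeds every field.
            out[a] = {"B": b, "C_min": c, "D_min": d, "C_first": c, "D_first": d}
        else:
            g = out[a]
            g["B"] += b
            # Zero-padded strings of equal width compare like numbers.
            g["C_min"] = min(g["C_min"], c)
            g["D_min"] = min(g["D_min"], d)
    return out

result = aggregate(rows)
for a in sorted(result):
    print(a, result[a])
```

On this data the minimum values (001/003 for A=1, 003/004 for A=2) match the output requested in the question, while the first-seen values in input order would be 001/004 and 005/004. Minimum and First are therefore not interchangeable in general; they only coincide when the data within each group is ordered so that the smallest value comes first.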