Hi !
I need perfom the followwing task :
Column1 Column2 Column3
1 1 3
1 1 3
2 3 4
2 3 4
2 3 4
I need to get :
1 1 1
2 3 1
the last number is the number of disctinct values of column 3 group byed according column 1 and column 2.
Can I do it with aggregator ???
Distinct Count With Aggregator
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I don't believe there is a "count distinct" aggregate function in the server job Aggregator stage. You could, of course, group by all three and simply count col3 then pass those results through a second Aggregator stage to group by col1 and col2 and to sum the summedcol3 generated by the first Aggregator stage.
Or you could work with stage variables in a Transformer stage (and sorted input) to achieve the same result.
Or you could work with stage variables in a Transformer stage (and sorted input) to achieve the same result.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Yes. This can be done using aggregator stage but your input should be loaded first into hash file and then aggregator then do the count(3rd column) group by 1,2. While loading into hash file, define all three columns as key and read from hash file and change the keys (key would be the first two columns) and that will give you the results as expected.
HTWH.
Regards
Saravanan
HTWH.
Regards
Saravanan