simple question reg aggregator
Moderators: chulett, rschirm, roy
simple question reg aggregator
Hi,
I could not find any simple solution for this simple task.
I just need to group on a deptno and to find a max of sal for that group.
And to fetch the correspoding record for that max(sal).
Obviously to find the max i used Aggregator (Can also be done without aggregator, But...)
Inorder to get the output from the Aggregator i need to do any one of group by function for all rest of the columns.
But i need not to....
Should i have to pass the columns that need to be Aggregated to Aggregaotor and the rest of the col to a merge to get the desired output.
May i know What is in practice......
regards
kumar
I could not find any simple solution for this simple task.
I just need to group on a deptno and to find a max of sal for that group.
And to fetch the correspoding record for that max(sal).
Obviously to find the max i used Aggregator (Can also be done without aggregator, But...)
Inorder to get the output from the Aggregator i need to do any one of group by function for all rest of the columns.
But i need not to....
Should i have to pass the columns that need to be Aggregated to Aggregaotor and the rest of the col to a merge to get the desired output.
May i know What is in practice......
regards
kumar
-
- Participant
- Posts: 232
- Joined: Sat May 07, 2005 2:49 pm
- Location: USA
Hi Kumar,
Contrary to your subject for this post "simple question reg aggregator". Your second half of the post is quite confusing.
Anyway, from what I have understood about your problem, here is my take on it.
Write the output to a hashed file and then use the key columns in the hashed file with your input data and find the input record for that max(sal) criteria.
Contrary to your subject for this post "simple question reg aggregator". Your second half of the post is quite confusing.
Anyway, from what I have understood about your problem, here is my take on it.
In the aggregator stage, group by the deptno and do a max(sal) for that group. Along, with these 2 cols, also pass the key columns to the aggregator stage and just pass them through without any tranformation.I could not find any simple solution for this simple task.
I just need to group on a deptno and to find a max of sal for that group.
And to fetch the correspoding record for that max(sal).
Write the output to a hashed file and then use the key columns in the hashed file with your input data and find the input record for that max(sal) criteria.
Thanks,
Naveen
Naveen
Its my turn to get confused :D
let me put this way,
ip
juz need to grouped by account number and found out max of sal
so the output should be
regards
kumar
let me put this way,
ip
Code: Select all
empno name sal deptno
1111 aaaa 100 10
2222 bbbb 200 20
3333 cccc 300 10
5555 dddd 500 30
4444 eeee 600 20
so the output should be
Code: Select all
empno name sal deptno
3333 cccc 300 10
4444 eeee 600 20
5555 dddd 500 30
i dont know how to pass a col without doing any aggregation in the aggregator stagepass the key columns to the aggregator stage and just pass them through without any tranformation.
regards
kumar
-
- Participant
- Posts: 232
- Joined: Sat May 07, 2005 2:49 pm
- Location: USA
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The best way to solve this kind of problem is to figure out how you would do it if you only had SQL.
In your case this will convince you that it can't be done in one pass. In SQL you would need a nested query.
So too in DataStage. The "nested query" populates a lookup of some kind with the maximum value and corresponding search key. The main pass compares the search key and value against the lookup containing search key and maximum value (both as keys), and imposes the rule "drop" if the lookup fails.
In your case this will convince you that it can't be done in one pass. In SQL you would need a nested query.
So too in DataStage. The "nested query" populates a lookup of some kind with the maximum value and corresponding search key. The main pass compares the search key and value against the lookup containing search key and maximum value (both as keys), and imposes the rule "drop" if the lookup fails.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi,
If I understand your problem, I think you have a very better way to do this.
First step : partition by "deptno"
Second step : sort by "sal" with the sort stage (descending mode)
Last step : use the stage which permit to have unique record on a key (I don't remember its name). Choose "deptno" as key and keep the first record.
That's it.
It will be very faster.
Matthieu
If I understand your problem, I think you have a very better way to do this.
First step : partition by "deptno"
Second step : sort by "sal" with the sort stage (descending mode)
Last step : use the stage which permit to have unique record on a key (I don't remember its name). Choose "deptno" as key and keep the first record.
That's it.
It will be very faster.
Matthieu