Hi,
I have a flat file which has data in the following format
Dept Emp Sal
10 50 1000
10 50 2000
10 50 2000
10 20 3000
10 20 3000
10 20 4000
I am interested in picking up the employee , where sum(sal) is the highest.In the above case the expected result is
Dept Emp Sal
10 20 10000
This is how i am doing it
1) Aggregator Transformation -- Sum Sal group by Dept and Emp
Results
Dept Emp Sal
10 50 5000
10 20 10000
2) Sort Transformation - Sort the data based on sal
Results
Dept Emp Sal
10 20 10000
10 50 5000
3) Aggregator Transformation - Pick up the first record
Results
Dept Emp Sal
10 20 10000
Is there a better way of acheiving the same results .
Please help?
Sam
Sum , Sort and Get Results
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 20
- Joined: Thu Oct 06, 2005 12:23 pm
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Two Aggregator stages. The first generates the sum for each Dept/Emp the second generates the max of that. Sort the input file by Dept and Emp (use a filter in the Sequential File stage) and assert this sorted order in both Aggregator stages; this will make the job finish faster.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
No, not really. What's stopping you from loading it into a work table in your database of choice and working it over with a SQL stick?samit_9999 wrote:I just receive a flat file which has data in the above format...so i need to carry the above operations in datastage.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 12
- Joined: Wed Mar 02, 2005 1:59 am
If it ireal time example then how one employee can have more then 1 salary?
Last edited by Mandar_Deshpande on Sun Aug 06, 2006 11:24 pm, edited 1 time in total.
It is not Max(Sal) but it is Max(Sum(Sal)), so we need it.Mandar_Deshpande wrote:Hey,
I dont understand why we are using aggregator in between. what I see from example is data is at emp & dept level. then why we are agregating group by emp & dept. (as it is already aggragated)
Just sort it descending by salary & get the first row
Let me know if I am wrong.
- Mandar
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'