find average salary

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
harikumar
Participant
Posts: 33
Joined: Wed Apr 21, 2010 9:19 pm
Location: banglore

find average salary

Post by harikumar »

i have source table like
empno,salary
10,1000
20,4000
30,5000
10,3000
30,5000
40,6000

i want to do find who are the employees having salary more than average salary

cal anybody tell how can we solve this using stages
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Pick a data connector.
Write the custom SQL query to retrieve the relevant records.
Do something with those records.
Go home early: you've earned it.

:wink:
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You need two passes through the data - just like you would in SQL (an outer query and an inner query). In the parallel world that will involve two jobs, since blocking operations are not permitted. The first job gets the average salary - probably using an Aggregator stage - and stores that somewhere that the controlling sequence can pick up and supply to the second job as a parameter value. The second job filters your data records according to requirement, using the parameter value as the baseline average salary.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You requirement falls into reporting category than ETL.

Especially if you append the data the averages will change.

Use a reporting tool to achieve your goal.
grimm336
Participant
Posts: 12
Joined: Thu Jun 25, 2009 10:19 am

Post by grimm336 »

Use the sql in your database stage....
select sal from employee where sal>avg(sal)

And link the out put to a flat file which is having same meta data as the table...

cheers
grimm
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

My guess is that that SQL will throw a syntax error of the form "avg function used with no grouping column". This has to be a nested query.

Code: Select all

select sal from employee where sal > (select avg(sal) from employee)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

In other words, two passes through the data. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
laknar
Participant
Posts: 162
Joined: Thu Apr 26, 2007 5:59 am
Location: Chennai

Post by laknar »

Below is the Query you are looking for.

Code: Select all

SELECT EMPNO,SALARY FROM TABLENAME WHERE SALARY>(SELECT AVG(SALARY) FROM TABLENAME) 
Regards
LakNar
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That looks vaguely familiar yet, at the same time, somehow wrong.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply