how to get TOP 5 sums

Posted: Wed Jun 01, 2011 7:15 am
by Devendrudu
Hi Friends,

I have two files:
f1 has cust_id, Tran_id, sales_amount

f2 has cust_id, cust_name

I want output like cust_name, Total_sales_amount

I want only top 5 Total_sales_amount customers.


How can I do this?

Posted: Wed Jun 01, 2011 7:26 am
by chulett
Join the two, aggregate to get the total sales amount per customer, sort the result by that total descending, and constrain the output to the first five records. Note that you'll need to run sequentially or on one node for this to work properly.
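
For anyone who wants to see the logic outside the tool, here is a minimal Python sketch of that sequential flow: sum per customer, join to the names, sort descending, keep the first five. It is not DataStage code. The function name top_customers and the tuple layouts are assumptions made up for illustration, and it assumes an inner join (customers without a name row are dropped).

```python
from collections import defaultdict


def top_customers(sales, customers, n=5):
    """Return the n (cust_name, total_sales_amount) rows with the largest totals.

    sales:     iterable of (cust_id, tran_id, sales_amount)  -- layout of f1
    customers: iterable of (cust_id, cust_name)              -- layout of f2
    """
    # Aggregate: one running total per cust_id.
    totals = defaultdict(float)
    for cust_id, _tran_id, amount in sales:
        totals[cust_id] += amount

    # Join: look up the name for each cust_id (inner join, an assumption).
    names = dict(customers)
    joined = [(names[cid], total) for cid, total in totals.items() if cid in names]

    # Sort by total descending and keep only the first n rows.
    joined.sort(key=lambda row: row[1], reverse=True)
    return joined[:n]
```

Because everything runs in a single process here, the "first five rows" really are the overall top five, which is the point of running sequentially.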

Posted: Wed Jun 01, 2011 12:42 pm
by Devendrudu
If it runs on 2 nodes or 4 nodes, how will I get the top 5 sums?

Posted: Wed Jun 01, 2011 1:02 pm
by soumya5891
In the sort stage you need to perform hash partitioning properly.

Posted: Wed Jun 01, 2011 1:02 pm
by chulett
You'll get the Top 5 sums per node unless you throttle things down to sequential execution, hence my suggestion.
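
The per-node effect can be illustrated with a small Python simulation. This is only a sketch: the helper names top_n and per_node_top_n are hypothetical, the round-robin split stands in for whatever partitioning the job actually uses, and the totals are invented sample data.

```python
def top_n(rows, n=5):
    """Sort (name, total) rows by total descending and keep the first n."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def per_node_top_n(rows, nodes, n=5):
    """Simulate a parallel run: each node keeps the first n rows of its own partition."""
    partitions = [rows[i::nodes] for i in range(nodes)]  # round-robin split (assumption)
    out = []
    for part in partitions:
        out.extend(top_n(part, n))
    return out


# Invented sample data: 20 customers with distinct totals.
totals = [(f"cust{i}", float(i)) for i in range(1, 21)]

print(len(per_node_top_n(totals, nodes=2)))  # 10 rows: five from each node
print(len(top_n(totals)))                    # 5 rows: the sequential result
```

On two nodes the constraint lets through five rows per node, so ten rows reach the output instead of five. Collecting those rows into a single sequential stream and applying the limit again brings it back to the true top five.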

Posted: Wed Jun 01, 2011 1:07 pm
by Devendrudu
Which key should I hash partition on?

Posted: Wed Jun 01, 2011 1:16 pm
by DSguru2B
The answer lies in this statement
Devendrudu wrote: I want only top 5 Total_sales_amount customers.
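
One possible reading of this hint, and it is only an assumption, is that the totals are per customer, so cust_id is the key to partition on. With all rows for a customer on the same node, each node can compute complete totals, and a final sequential step picks the overall top five. A rough Python sketch of that idea follows; the function name parallel_top_n is hypothetical, cust_id is assumed to be an integer, and cust_id % nodes stands in for the hash partitioner. The join to cust_name is left out to keep it short.

```python
from collections import defaultdict


def parallel_top_n(sales, nodes, n=5):
    """sales: iterable of (cust_id, tran_id, sales_amount); returns (cust_id, total) rows."""
    # Partition on cust_id so every row for a customer lands on the same node.
    partitions = [[] for _ in range(nodes)]
    for cust_id, _tran_id, amount in sales:
        partitions[cust_id % nodes].append((cust_id, amount))

    candidates = []
    for part in partitions:
        # Per node: totals are complete because no customer is split across nodes.
        totals = defaultdict(float)
        for cust_id, amount in part:
            totals[cust_id] += amount
        # Per node: keep only the local top n.
        candidates.extend(sorted(totals.items(), key=lambda r: r[1], reverse=True)[:n])

    # Sequential final step: at most n * nodes candidate rows, reduced to the overall top n.
    return sorted(candidates, key=lambda r: r[1], reverse=True)[:n]
```

The final step still has to run sequentially, but it only sees a handful of rows per node rather than the whole data set.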