I have the following data:
Loan_Id Amount Produts_Id Store_Id
1 1000.9 55 12
1 .082 55 12
2 2000 55 32
3 0.8776 55 05
2 9.8 55 32
3 1000 55 05
Number of nodes: 2
Stages used:
oracle ----> agg (group by loan_id, product_id, store_id; sum(amount); partitioning set to Auto) ----> sequential file
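To make the expected output concrete, here is a minimal Python sketch of what the aggregation computes for the sample rows above. It is not DataStage code; the names `rows` and `totals` are made up for illustration, and amounts are parsed as `Decimal` only to avoid floating-point noise in the sums.

```python
from collections import defaultdict
from decimal import Decimal

# Sample rows from the post: (loan_id, amount, product_id, store_id).
# Amounts and store ids are kept as strings so "05" and ".082" survive as written.
rows = [
    (1, "1000.9", 55, "12"),
    (1, ".082", 55, "12"),
    (2, "2000", 55, "32"),
    (3, "0.8776", 55, "05"),
    (2, "9.8", 55, "32"),
    (3, "1000", 55, "05"),
]

# Group by (loan_id, product_id, store_id) and sum the amount per group.
totals = defaultdict(Decimal)
for loan_id, amount, product_id, store_id in rows:
    totals[(loan_id, product_id, store_id)] += Decimal(amount)

for key in sorted(totals):
    print(key, totals[key])
```

The six input rows collapse to three output rows, one per loan.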
I then modified the job as follows:
oracle ----> sort_stage (order by loan_id) ----> agg (group by loan_id, product_id, store_id; sum(amount); partitioning set to Hash on loan_id, products_id, store_id) ----> sequential file
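The data flow of the modified job can be sketched in Python as follows. This is only an illustration of hash partitioning on the grouping keys across two nodes, followed by a sort and aggregation inside each partition; it says nothing about how fast DataStage runs either design. The helper names (`node_for`, `sort_then_aggregate`) and the CRC32-based hash are assumptions for the example, not what the engine uses internally.

```python
import zlib
from decimal import Decimal
from itertools import groupby

NUM_NODES = 2  # matches the two-node configuration described in the post

# Sample rows: (loan_id, amount, product_id, store_id)
rows = [
    (1, "1000.9", 55, "12"),
    (1, ".082", 55, "12"),
    (2, "2000", 55, "32"),
    (3, "0.8776", 55, "05"),
    (2, "9.8", 55, "32"),
    (3, "1000", 55, "05"),
]

def key_of(row):
    """Grouping key: (loan_id, product_id, store_id)."""
    return row[0], row[2], row[3]

def node_for(key):
    """Hypothetical hash partitioner: the same key always maps to the same node."""
    return zlib.crc32(repr(key).encode()) % NUM_NODES

# Hash-partition the rows on the grouping keys.
partitions = [[] for _ in range(NUM_NODES)]
for row in rows:
    partitions[node_for(key_of(row))].append(row)

def sort_then_aggregate(partition):
    """Sort one partition on the grouping keys, then sum amounts per group."""
    ordered = sorted(partition, key=key_of)
    return {
        key: sum((Decimal(r[1]) for r in group), Decimal(0))
        for key, group in groupby(ordered, key=key_of)
    }

per_node = [sort_then_aggregate(p) for p in partitions]

# Because each key lives on exactly one node, the per-node results
# can simply be concatenated to get the final output.
combined = {}
for result in per_node:
    combined.update(result)
```

Since every row with the same key lands in the same partition, each node can aggregate independently and the combined result matches a single global group-by.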
I do not see any significant improvement in performance. Can somebody help me see where I'm going wrong?
This is the only job that is running in the environment, no other jobs or applications are running in the environment.