Using RCP in a job that has multiple join stages
Posted: Mon Oct 17, 2011 7:19 am
Hello,
There is a parallel job that has multiple (4-5) join stages. Each stage joins on about 10 key columns, but the inputs to the join stages carry anywhere from 70 to 120 columns. I am looking for ways to reduce the total run time of the job, and I am wondering whether using RCP would help. I have come across posts saying that RCP does not work well with joins, and other posts saying that RCP does not reduce the total run time of a job. Any thoughts to help me decide for or against RCP are greatly appreciated.
Thanks.