Improve performance of Join of Data Sets
Posted: Fri Mar 14, 2014 1:44 am
Hi All,
I am facing the same issue described in this post.
I have two data sets as input to a Join stage, and it is extracting only 276 rows/sec.
The details are below.
Data Set 1
-----------
Total Records: 19199366
Total 32k Blocks: 19141
Total Bytes: 2430032178
Node    Records   Blocks   Bytes
Node1   9593377   9566     1214500680
Node2   9605989   9575     1215531492
Data Set 2
-----------
Total Records: 19199355
Total 32k Blocks: 23812
Total Bytes: 3041367308
Node    Records   Blocks   Bytes
Node1   9597492   11903    1520355820
Node2   9601863   11909    1521011488
Please suggest what can be done to improve performance.
I also added the two environment variables suggested by Ravi, keeping their default values, but it did not help.
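As a quick sanity check on the figures above, here is a minimal Python sketch that derives the average record size, the balance between the two nodes, and how long one input would take at the observed rate. The names `datasets`, `summarize` and `OBSERVED_ROWS_PER_SEC` are made up for illustration, and the run-time estimate assumes the 276 rows/sec rate applies to a single input and stays constant, which may not hold.

```python
# Figures copied from the data set listings above.
datasets = {
    "Data Set 1": {
        "bytes": 2430032178,
        "nodes": {"Node1": 9593377, "Node2": 9605989},
    },
    "Data Set 2": {
        "bytes": 3041367308,
        "nodes": {"Node1": 9597492, "Node2": 9601863},
    },
}

# Throughput reported for the Join stage (assumed constant for the estimate).
OBSERVED_ROWS_PER_SEC = 276


def summarize(ds):
    """Derive simple statistics from one data set listing."""
    per_node = list(ds["nodes"].values())
    records = sum(per_node)
    mean_per_node = records / len(per_node)
    return {
        "records": records,
        # Average record width in bytes.
        "avg_bytes_per_record": ds["bytes"] / records,
        # 1.0 means perfectly even partitions; larger means more skew.
        "skew": max(per_node) / mean_per_node,
        # Hours needed to read every record at the observed rate.
        "hours_at_observed_rate": records / OBSERVED_ROWS_PER_SEC / 3600,
    }


for name, ds in datasets.items():
    s = summarize(ds)
    print(
        f"{name}: {s['records']} records, "
        f"{s['avg_bytes_per_record']:.1f} bytes/record, "
        f"skew {s['skew']:.4f}, "
        f"{s['hours_at_observed_rate']:.1f} h at {OBSERVED_ROWS_PER_SEC} rows/sec"
    )
```

If the skew values come out close to 1.0, the records are spread almost evenly across the two nodes, so uneven partition sizes alone are unlikely to explain the low rate. This says nothing about whether both inputs are partitioned and sorted on the join key.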