Aggregator showing different count for different runs
Posted: Fri Sep 11, 2009 4:40 am
Hi All,
My job design is as follows:
dataset--->transformer---->aggregator---->ODBC connector
I use hash partitioning and perform a sort on the aggregator. I have 4 key columns (Not Null) and one column whose sum I am calculating. Somehow I get a different count coming out of the aggregator stage every time I run the job. The job runs on 4 nodes. Could this be because the data is wrong, or is it due to partitioning issues? Actually, one of the key columns had some null values and spaces; however, we have handled them in the transformer before the aggregator stage, so no nulls are passed on. However, I fear the cause may be special characters in that key column that have not been handled.
Any help is appreciated.
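
To make the two suspected causes concrete, here is a minimal Python sketch (not DataStage code) of a per-partition aggregation on 4 nodes. It assumes the aggregator groups rows independently within each partition. The column names (k1, k2, amt), the sample rows, and the helper functions are all hypothetical. It shows that hashing on the grouping keys keeps every group on one node, that a partitioning not based on the keys can split a group across nodes and inflate the output row count, and that an unhandled trailing space in a key creates an extra group.

```python
import zlib
from collections import defaultdict

def hash_partition(rows, key_cols, nodes):
    """Send each row to a node chosen by a hash of its key columns."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        key = "|".join(str(row[c]) for c in key_cols)
        parts[zlib.crc32(key.encode("utf-8")) % nodes].append(row)
    return parts

def round_robin_partition(rows, nodes):
    """Deal rows out to nodes in turn, ignoring the key columns."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def aggregate(parts, key_cols, sum_col):
    """Sum sum_col per key within each partition, then collect all output rows."""
    out = []
    for part in parts:
        totals = defaultdict(int)
        for row in part:
            totals[tuple(row[c] for c in key_cols)] += row[sum_col]
        out.extend(totals.items())
    return out

# Hypothetical sample data; the third row has a trailing space in k2.
rows = [
    {"k1": "A", "k2": "x", "amt": 10},
    {"k1": "A", "k2": "x", "amt": 5},
    {"k1": "A", "k2": "x ", "amt": 1},
    {"k1": "B", "k2": "y", "amt": 7},
    {"k1": "B", "k2": "y", "amt": 3},
]
keys = ["k1", "k2"]

# Hash on the grouping keys: one output row per distinct key.
hashed = aggregate(hash_partition(rows, keys, 4), keys, "amt")

# Partitioning that ignores the keys: groups can be split across nodes.
rr = aggregate(round_robin_partition(rows, 4), keys, "amt")

# Trim the key values first, then hash-partition and aggregate.
trimmed = [{**r, "k2": r["k2"].strip()} for r in rows]
cleaned = aggregate(hash_partition(trimmed, keys, 4), keys, "amt")

print(len(hashed), len(rr), len(cleaned))
```

If the real job behaves like the round-robin case, the output count would depend on how rows happen to land on the nodes, which could explain a count that changes between runs. If it behaves like the hashed case but the count is still higher than expected, untrimmed or special characters in a key column would be the more likely suspect.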