Data claims to already be sorted
Posted: Mon Jul 07, 2008 4:25 am
I have a job as below.
DB2------>Aggregator------>Join Stage.......>Dataset
Keys in Aggregator are - Key1, Key2, Key3 and Key4
Key in Join Stage is - Key1
I am using the Sort method to aggregate the data, so Hash partitioning on the key columns is done on the input link of the Aggregator stage.
Since the Join stage uses only one key (Key1), I am doing Hash/Sort on Key1 again on the input link of the Join stage.
I am getting the following warning in the Director.
"When checking operator: Data claims to already be sorted on the specified keys the 'sorted' option can be used to confirm this. Data will be resorted as necessary. Performance may improve if this sort is removed from the flow"
When I remove the Hash/Sort on the link from the Aggregator to the Join stage and set it to Same, the warning is no longer thrown.
I am confused about the concept of Hash/Sort.
As I understand it, because the keys are different, the data should be partitioned again on the new key.
Please point out any misconception, and please explain what the warning means by "the 'sorted' option can be used to confirm this".
Thanks in Advance
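
To make the partition-versus-sort distinction in the question concrete, here is a minimal Python sketch. It is only a toy model and not DataStage itself: the partition count, the hash function and every helper name (`partition_of`, `hash_partition`, `sort_each`, and so on) are made up for illustration. It shows two things: rows sorted on Key1, Key2, Key3, Key4 within a partition are already sorted on Key1 alone, which may be why a second sort on Key1 is reported as unnecessary, while hashing on all four keys does not by itself keep equal Key1 values in the same partition.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical number of partitions (nodes)


def partition_of(row, key_idx, n=NUM_PARTITIONS):
    """Toy deterministic hash over the chosen key columns (not DataStage's hash)."""
    return sum(row[i] * (31 ** j) for j, i in enumerate(key_idx)) % n


def hash_partition(rows, key_idx, n=NUM_PARTITIONS):
    """Distribute rows across partitions by hashing the chosen key columns."""
    parts = defaultdict(list)
    for row in rows:
        parts[partition_of(row, key_idx, n)].append(row)
    return dict(parts)


def sort_each(parts, key_idx):
    """Sort every partition independently on the chosen key columns."""
    return {
        p: sorted(rs, key=lambda r: tuple(r[i] for i in key_idx))
        for p, rs in parts.items()
    }


def is_sorted_on(rows, key_idx):
    """True if rows are in non-decreasing order on the chosen key columns."""
    keys = [tuple(r[i] for i in key_idx) for r in rows]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def partitions_per_key1(parts):
    """Map each Key1 value to the set of partitions in which it appears."""
    seen = defaultdict(set)
    for p, rs in parts.items():
        for r in rs:
            seen[r[0]].add(p)
    return dict(seen)


# Sample rows of the form (Key1, Key2, Key3, Key4).
rows = [(k1, k2, k3, k4)
        for k1 in range(3) for k2 in range(2)
        for k3 in range(2) for k4 in range(2)]

ALL_KEYS = [0, 1, 2, 3]
KEY1_ONLY = [0]

# Aggregator input: hash and sort on all four keys.
agg_parts = sort_each(hash_partition(rows, ALL_KEYS), ALL_KEYS)

# Sorted on (Key1..Key4) implies sorted on the leading key Key1.
already_sorted_on_key1 = all(is_sorted_on(rs, KEY1_ONLY) for rs in agg_parts.values())

# But the same Key1 value can be spread over several partitions.
key1_is_split = any(len(s) > 1 for s in partitions_per_key1(agg_parts).values())

# Join input: re-hash on Key1 only, so each Key1 value lives in one partition.
flat = [r for rs in agg_parts.values() for r in rs]
join_parts = sort_each(hash_partition(flat, KEY1_ONLY), KEY1_ONLY)
key1_is_colocated = all(len(s) == 1 for s in partitions_per_key1(join_parts).values())

print(already_sorted_on_key1, key1_is_split, key1_is_colocated)
```

In this toy model, the sort order on Key1 comes for free from the four-key sort, whereas the placement of equal Key1 values depends on which columns were hashed. Whether the same reasoning applies to a particular job depends on how the links are actually configured.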