Remove Duplicates using hash partitioning only
Posted: Wed Mar 28, 2012 2:21 am
Hi All,
I have 18 lakh (1.8 million) records with duplicate values in the GRP column, and only 9 distinct values among them. I am using the Remove Duplicates stage, and I have explicitly specified hash partitioning on the key column, with no sort. It is working fine.
My confusion is that, according to the documentation, the Remove Duplicates stage needs data that is both sorted and hash-partitioned on the key column, but in my case it is only hash-partitioned.
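To show why I expected the sort to matter, here is a minimal Python sketch. It assumes the stage drops a record only when its key matches the immediately preceding record in the same partition; that is my reading of the documentation, not something I have confirmed. The function name and the sample rows are made up for illustration.

```python
from itertools import groupby


def remove_adjacent_duplicates(records, key):
    """Keep the first record of each run of consecutive equal keys.

    Assumption: this mimics a dedup that compares each record only
    with the one directly before it, within a single partition.
    """
    return [next(group) for _, group in groupby(records, key=key)]


def grp(record):
    # Hypothetical key extractor: the first field plays the role of GRP.
    return record[0]


# Hypothetical rows in one partition, arriving unsorted.
rows = [("A", 1), ("B", 2), ("A", 3), ("B", 4), ("A", 5)]

# Without a sort, equal keys are never adjacent, so nothing is dropped.
unsorted_result = remove_adjacent_duplicates(rows, grp)

# With a sort on the key, equal keys form runs and collapse to one each.
sorted_result = remove_adjacent_duplicates(sorted(rows, key=grp), grp)

print(len(unsorted_result), sorted_result)
```

Under that assumption, hash partitioning alone only guarantees that equal keys land in the same partition, not that they are next to each other, so I would have expected duplicates to survive.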
I need your input on how it is working.
Thanks