To Get the First Duplicate Record from HashFile Output

Posted: Wed Jun 15, 2005 6:37 am
by tombastian
Hi All,
I have a Hashed File stage that receives a few records with duplicate keys. Because of the way the Hashed File stage works, the output contains the last record written for each duplicate key. Is there a way to get the first record among the duplicates as the output instead? I am currently using a Sort stage and a surrogate key to get the first one, but I would like to know whether there is a better option using some functionality of the Hashed File stage itself.


Input to Hash File

Col1 (Key Field in HF)   Col2   Col3
100                      ABC    C99
100                      RXZ    G77
100                      JKL    G77
115                      XYZ    R33

Normal Output

Col1 (Key Field in HF)   Col2   Col3
100                      JKL    G77
115                      XYZ    R33

Required Output

Col1 (Key Field in HF)   Col2   Col3
100                      ABC    C99
115                      XYZ    R33

Thanks in Advance,
Tom.

Posted: Wed Jun 15, 2005 6:45 am
by chulett
If you want the 'first' rather than the 'last', you need to sort input on your key fields in a descending order rather than ascending. Then all you'll have in the hash when you are done are the first (lowest) values for any duplicate keys.
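
The effect of reversing the load order can be illustrated outside DataStage. The following is only a minimal Python sketch, not DataStage code: a plain dict stands in for the hashed file (a later write to a key replaces the earlier one), the rows are the sample data from the original post, and the arrival sequence number and the helper `load_hashed_file` are hypothetical names introduced for illustration.

```python
# Rows from the original post: (Col1 key, Col2, Col3), in arrival order.
rows = [
    (100, "ABC", "C99"),
    (100, "RXZ", "G77"),
    (100, "JKL", "G77"),
    (115, "XYZ", "R33"),
]


def load_hashed_file(records):
    """Simulate destructive overwrite: a later write to a key replaces the earlier one."""
    hashed = {}
    for key, col2, col3 in records:
        hashed[key] = (col2, col3)
    return hashed


# Loading in arrival order keeps the LAST duplicate for each key.
normal = load_hashed_file(rows)

# Assumption: each row is tagged with an arrival sequence number (hypothetical).
# Sorting on it in descending order means the earliest duplicate is written last,
# so it is the one that survives the overwrites.
sequenced = list(enumerate(rows))
sequenced.sort(key=lambda pair: pair[0], reverse=True)
first_wins = load_hashed_file(row for _, row in sequenced)

print(normal)
print(first_wins)
```

In this sketch the reversal is done on arrival order. In a real job the descending sort would need to be on whatever column defines "first" within a key.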

Posted: Wed Jun 15, 2005 6:47 am
by ArndW
The Hash file key must be different from what you've stated, but the general command for the hash file SELECT would read

SELECT HF BY Col1 BREAK.ON Col1 DET.SUP

Posted: Wed Jun 15, 2005 3:06 pm
by ray.wurlod
Either sort data to be loaded into the hashed file in reverse order, as Craig suggested, or de-duplicate the data by other means before loading them into the hashed file.
All writes to hashed files via the Hashed File stage are destructive overwrites.
If you use a UV stage to insert rows you will achieve what you want, but generate warnings (row already exists) for each duplicate key value. The UV stage uses SQL.
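
The difference between a destructive overwrite and an SQL insert can be sketched as follows. This is an analogy only, assuming an in-memory SQLite table as a stand-in for the hashed file accessed through SQL. It is not the UV stage, and the table name `hf` and the `warnings` list are hypothetical. An INSERT against an existing key fails, so the first row for each key is kept and every later duplicate produces one warning.

```python
import sqlite3

# Rows from the original post: (Col1 key, Col2, Col3), in arrival order.
rows = [
    (100, "ABC", "C99"),
    (100, "RXZ", "G77"),
    (100, "JKL", "G77"),
    (115, "XYZ", "R33"),
]

conn = sqlite3.connect(":memory:")
# col1 is the key, so a second INSERT with the same value is rejected.
conn.execute("CREATE TABLE hf (col1 INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT)")

warnings = []
for row in rows:
    try:
        conn.execute("INSERT INTO hf (col1, col2, col3) VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        # The row already exists: keep the first one and record a warning.
        warnings.append((row[0], str(exc)))

result = conn.execute("SELECT col1, col2, col3 FROM hf ORDER BY col1").fetchall()
conn.close()

print(result)
print(len(warnings), "duplicate rows rejected")
```

The cost of this approach is the one noted above: one warning per duplicate key value, which may matter if the job has a warning limit.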

Posted: Wed Jun 15, 2005 3:22 pm
by Sainath.Srinivasan
Pass the data through an Aggregator stage with the 'First' option set for the fields.

Posted: Wed Jun 15, 2005 8:45 pm
by Sreenivasulu
Use "first" in the Aggregator stage to get the first duplicate record,
or "last" to get the last duplicate record.

Regards
Sreenivasulu
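
What "first" and "last" do when grouping on the key can be shown with a small Python sketch. This is an illustration of the grouping logic only, not the Aggregator stage itself. The function `aggregate` and its `pick` parameter are hypothetical names, and the rows are the sample data from the original post.

```python
# Rows from the original post: (Col1 key, Col2, Col3), in arrival order.
rows = [
    (100, "ABC", "C99"),
    (100, "RXZ", "G77"),
    (100, "JKL", "G77"),
    (115, "XYZ", "R33"),
]


def aggregate(records, pick="first"):
    """Group on the key and keep either the first or the last row seen per key."""
    if pick not in ("first", "last"):
        raise ValueError("pick must be 'first' or 'last'")
    result = {}
    for key, col2, col3 in records:
        # 'last' always overwrites; 'first' writes only when the key is new.
        if pick == "last" or key not in result:
            result[key] = (col2, col3)
    return result


first_rows = aggregate(rows, pick="first")
last_rows = aggregate(rows, pick="last")

print(first_rows)
print(last_rows)
```

With `pick="first"` the result matches the required output in the original question, and with `pick="last"` it matches the normal hashed file output.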