How to read DISTINCT from flat file ?

Posted: Wed Jan 31, 2007 1:48 pm
by swades
Hi,

How do I read DISTINCT strings from a flat file?
Can anyone help, please?

Posted: Wed Jan 31, 2007 1:50 pm
by DSguru2B
Pass it through a hashed file; the last duplicate will be retained. Or, if you want the first one, pass it through an Aggregator and retain the first record, grouped on the column you want distinct values for.
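
To make the "last duplicate" versus "first record" distinction concrete, here is a minimal sketch in plain Python. It is not DataStage code: the sample rows, the choice of column 0 as the key, and the function names keep_last and keep_first are all made up for illustration.

```python
# Illustration only: plain Python standing in for the DataStage stages.
# The rows and the choice of column 0 as the key are invented for this example.
rows = [
    ("A", "first A"),
    ("B", "only B"),
    ("A", "second A"),
]


def keep_last(rows, key_index=0):
    """Later rows overwrite earlier ones, like writes to a keyed hashed file."""
    by_key = {}
    for row in rows:
        by_key[row[key_index]] = row  # overwrite any earlier row with this key
    return list(by_key.values())


def keep_first(rows, key_index=0):
    """Only the first row seen for each key is kept."""
    by_key = {}
    for row in rows:
        by_key.setdefault(row[key_index], row)  # ignore later rows with this key
    return list(by_key.values())


print(keep_last(rows))
print(keep_first(rows))
```

Both functions return one row per key; they differ only in which of the duplicate rows survives.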

Posted: Wed Jan 31, 2007 1:59 pm
by swades
Thanks for the response.

I am using PX, so can you please tell me what properties to set in the Aggregator stage?

Thanks

Posted: Wed Jan 31, 2007 2:01 pm
by DSguru2B
Ouch, completely overlooked that. Sorry about that. Look into the Remove Duplicates stage. That will get you what you want.

Posted: Wed Jan 31, 2007 2:09 pm
by swades
Thanks a lot

Posted: Wed Jan 31, 2007 4:21 pm
by ray.wurlod
Remove Duplicates requires sorted input. In increasing order of inefficiency, try the following techniques.
  • Sort within DataStage, specifying "Unique"
  • Filter through sort -u
  • Remove Duplicates stage
  • Access through ODBC and specify DISTINCT in the SQL statement.
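
The sorted-input requirement can be illustrated outside DataStage. The following is a minimal Python sketch, assuming a de-duplication step that only compares each row with its neighbour; that is an assumption made for illustration, not a description of DataStage internals. The function name remove_adjacent_duplicates and the sample lines are hypothetical.

```python
from itertools import groupby


def remove_adjacent_duplicates(lines):
    """Keep one line from each run of identical neighbouring lines."""
    # groupby only groups consecutive equal items, so equal lines that are
    # not next to each other end up in separate groups and are all kept.
    return [value for value, _run in groupby(lines)]


lines = ["b", "a", "b", "a"]  # made-up sample data

# Without sorting, no two neighbours are equal, so nothing is removed.
unsorted_result = remove_adjacent_duplicates(lines)

# After sorting, equal lines sit next to each other and collapse to one each.
sorted_result = remove_adjacent_duplicates(sorted(lines))

print(unsorted_result)
print(sorted_result)
```

On unsorted input the duplicates survive; sorting first brings equal lines together so each value is kept once, which is roughly what sort -u does for whole-line comparisons.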