How to read DISTINCT from flat file ?

swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

How to read DISTINCT from flat file ?

Post by swades »

Hi,

How can I read DISTINCT strings from a flat file?
Can anyone help, please?
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Pass it through a hashed file; the last duplicate will be retained. Or, if you want the first one, pass it through an Aggregator stage and retain the first record, grouped on the column you want distinct values for.
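
To make the "last duplicate" versus "first record" distinction concrete, here is a minimal Python sketch of the two behaviours. It is not DataStage code and only illustrates the semantics; the function names `keep_last` and `keep_first` and the sample rows are hypothetical.

```python
def keep_last(rows, key):
    """Keep the last row seen for each key (later rows overwrite earlier ones)."""
    seen = {}
    for row in rows:
        seen[key(row)] = row  # overwrite, so the final duplicate wins
    return list(seen.values())


def keep_first(rows, key):
    """Keep the first row seen for each key (later duplicates are ignored)."""
    seen = {}
    for row in rows:
        seen.setdefault(key(row), row)  # only stored if the key is new
    return list(seen.values())


# Hypothetical sample data: (name, sequence number)
rows = [("a", 1), ("b", 2), ("a", 3)]
last = keep_last(rows, key=lambda r: r[0])
first = keep_first(rows, key=lambda r: r[0])
```

With the sample rows, `last` holds the `("a", 3)` version of key `a`, while `first` holds `("a", 1)`.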
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

Post by swades »

Thanks for the response.

I am using PX, so can you please tell me which properties to set in the Aggregator stage?

Thanks
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Ouch, I completely overlooked that. Sorry about that. Look into the Remove Duplicates stage. That will get you what you want.
swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

Post by swades »

Thanks a lot
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Remove Duplicates requires sorted input. In increasing order of inefficiency, try the following techniques.
  • Sort within DataStage, specifying "Unique"
  • Filter through sort -u
  • Remove Duplicates stage
  • Access through ODBC and specify DISTINCT in the SQL statement.
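
The point about sorted input can be illustrated outside DataStage. The following is a rough Python sketch, not the stage's actual implementation: `sort_unique` approximates what `sort -u` does on whole lines, and `remove_adjacent_duplicates` (a hypothetical name) drops only duplicates that sit next to each other, which is why unsorted input lets some duplicates through.

```python
from itertools import groupby


def sort_unique(lines):
    """Sort the lines, then keep one copy of each (roughly what `sort -u` does)."""
    return [line for line, _ in groupby(sorted(lines))]


def remove_adjacent_duplicates(rows, key):
    """Keep the first row of each run of equal keys.

    Only removes all duplicates when rows are already sorted by key.
    """
    return [next(group) for _, group in groupby(rows, key=key)]


# Hypothetical sample data: (name, sequence number)
rows = [("a", 1), ("b", 2), ("a", 3)]
by_name = lambda r: r[0]

unsorted_result = remove_adjacent_duplicates(rows, key=by_name)
sorted_result = remove_adjacent_duplicates(sorted(rows, key=by_name), key=by_name)
```

On the unsorted sample, both `a` rows survive because they are not adjacent; after sorting on the key, only one row per key remains.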
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.