Remove duplicates problem

venkatvelpula
Joined: Sat Mar 18, 2006 10:27 pm

Post by venkatvelpula »

Guys,

I have 5 columns in the source, 2 of which are keys. Of the three non-key columns, two are "from_date" (col4) and "to_date" (col5).

The goal is to remove duplicates based on the key values, but some duplicates are valid depending on from_date and to_date.

Let's say I have three records like this

col1 (key)  col2 (key)  col3  col4      col5
123         456         123   04012007  08102007
123         456         345   10102007  03102008
123         456         678   05102007  07102008

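Of these three records, the first and second are valid, because the col4 (from_date) value of the second record falls after the col5 (to_date) value of the first record. The third record is the duplicate that should be removed.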

Please let me know how to get this resolved.
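
To make the intended rule concrete, here is a minimal Python sketch of the logic, written outside DataStage purely as an illustration. It assumes the dates are MMDDYYYY strings (the format is not stated above), and that a record is kept only when its from_date falls after the to_date of the last record kept for the same key pair. The function name `remove_overlapping_duplicates` and the tuple layout are made up for this example.

```python
from datetime import datetime
from itertools import groupby

# Assumption: dates are MMDDYYYY strings; the actual format is not stated.
DATE_FORMAT = "%m%d%Y"


def parse(date_text):
    """Convert a date string such as '04012007' into a datetime."""
    return datetime.strptime(date_text, DATE_FORMAT)


def remove_overlapping_duplicates(rows):
    """Keep, per (col1, col2) key, only rows whose from_date falls after
    the to_date of the previously kept row for that key.

    rows: tuples of (col1, col2, col3, from_date, to_date).
    """
    # Sort by the two keys, then by from_date, so rows with the same key
    # are adjacent and visited in chronological order.
    ordered = sorted(rows, key=lambda r: (r[0], r[1], parse(r[3])))
    kept = []
    for _, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        last_to_date = None
        for row in group:
            from_date, to_date = parse(row[3]), parse(row[4])
            if last_to_date is None or from_date > last_to_date:
                kept.append(row)
                last_to_date = to_date
    return kept


rows = [
    ("123", "456", "123", "04012007", "08102007"),
    ("123", "456", "345", "10102007", "03102008"),
    ("123", "456", "678", "05102007", "07102008"),
]
result = remove_overlapping_duplicates(rows)
for row in result:
    print(row)
```

With the three sample records this keeps the rows whose col3 is 123 and 345 and drops the one whose col3 is 678. In a parallel job, the same comparison could probably be built by sorting on the keys plus from_date and then comparing each row with the previous one using Transformer stage variables, though that design is only a suggestion here.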