Page 1 of 1

Identify recent record by date

Posted: Wed Aug 10, 2005 7:29 am
by vinaymanchinila
Hi,
We have SCD process. I need to read the records for a given set of keys and extract the row with highest Date value. Have searched in the forum but all of them are pointing on how to implement SCD , here I need to extract the record which has multiple occurances.

Thanks,

Re: Identify recent record by date

Posted: Wed Aug 10, 2005 7:36 am
by rsaliah
Is there a reason why you can't sort the data to get the max date?

Posted: Wed Aug 10, 2005 7:38 am
by vinaymanchinila
I can sort the data , and once I sort the data for a given key if there are 4 records, how do I tell DataStage to extract the record with max date?
Thanks,

Posted: Wed Aug 10, 2005 7:41 am
by chulett
Sounds like a sql problem to me, not a DataStage one. Isn't there someone on site that can help you write your source query? :?

Posted: Wed Aug 10, 2005 7:45 am
by rsaliah
OK, hopefully I haven't misunderstood. But the way I would do it is to extract all the rows and only keep the one I need. I would probably do this using Stage Variables or easier still in an aggregator stage.

Posted: Wed Aug 10, 2005 7:45 am
by vinaymanchinila
I wanted to do this in DataStage, yea we do have SQL guys here and I was also thinking of like
"Select * from table1 group by key"

Thanks,

Posted: Wed Aug 10, 2005 8:54 am
by morneh
Not quite sure what the actual question here is, but if I understand correctly you want the record with the max date if it gives you 4 occurences ?

If you want to do it in DS instead of doing it in the Database's SQL (which would be the better way of doing it), try an order by clause in your source, ordering by the date column. Then pump it into a hash file.

The hash file will overwrite the duplicates with the next incoming record.

So when the process is finished the hash file will contain a distinct group of records with the max date of each unique key.

LONG way of doing it though...

Morne

Posted: Wed Aug 10, 2005 8:59 am
by vinaymanchinila
I want the record with the max date if it gives you 4 occurences, and if I write it to ahash file , will the order by in the source makes sure that only the record with latest date will be written to the hash file !
Thanks,

Posted: Wed Aug 10, 2005 9:04 am
by chulett
Last one in wins, so...

Posted: Wed Aug 10, 2005 2:58 pm
by DaleK
If you want to do it via SQL I believe the SQL should look something like this.
Select * (or whatever columns you want)
from SCDim_table SCD
where SCD.insert_date = (select max(SCD1.insert_date
from SCDim_table SCD1
where SCD.key_col = SCD1.key_col)

That would be my 1st crack at the SQL.

Posted: Wed Aug 10, 2005 3:00 pm
by vinaymanchinila
Hi Dale,
thats exactly what I have settled with !
Thanks