Identify recent record by date

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
vinaymanchinila
Premium Member
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Identify recent record by date

Post by vinaymanchinila »

Hi,
We have SCD process. I need to read the records for a given set of keys and extract the row with highest Date value. Have searched in the forum but all of them are pointing on how to implement SCD , here I need to extract the record which has multiple occurances.

Thanks,
rsaliah
Participant
Posts: 65
Joined: Thu Feb 27, 2003 8:59 am

Re: Identify recent record by date

Post by rsaliah »

Is there a reason why you can't sort the data to get the max date?
vinaymanchinila
Premium Member
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

I can sort the data , and once I sort the data for a given key if there are 4 records, how do I tell DataStage to extract the record with max date?
Thanks,
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sounds like a sql problem to me, not a DataStage one. Isn't there someone on site that can help you write your source query? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
rsaliah
Participant
Posts: 65
Joined: Thu Feb 27, 2003 8:59 am

Post by rsaliah »

OK, hopefully I haven't misunderstood. But the way I would do it is to extract all the rows and only keep the one I need. I would probably do this using Stage Variables or easier still in an aggregator stage.
vinaymanchinila
Premium Member
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

I wanted to do this in DataStage, yea we do have SQL guys here and I was also thinking of like
"Select * from table1 group by key"

Thanks,
morneh
Participant
Posts: 17
Joined: Wed Jan 28, 2004 8:09 am
Contact:

Post by morneh »

Not quite sure what the actual question here is, but if I understand correctly you want the record with the max date if it gives you 4 occurences ?

If you want to do it in DS instead of doing it in the Database's SQL (which would be the better way of doing it), try an order by clause in your source, ordering by the date column. Then pump it into a hash file.

The hash file will overwrite the duplicates with the next incoming record.

So when the process is finished the hash file will contain a distinct group of records with the max date of each unique key.

LONG way of doing it though...

Morne
vinaymanchinila
Premium Member
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

I want the record with the max date if it gives you 4 occurences, and if I write it to ahash file , will the order by in the source makes sure that only the record with latest date will be written to the hash file !
Thanks,
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Last one in wins, so...
-craig

"You can never have too many knives" -- Logan Nine Fingers
DaleK
Premium Member
Premium Member
Posts: 68
Joined: Fri Jun 27, 2003 8:33 am
Location: Orlando

Post by DaleK »

If you want to do it via SQL I believe the SQL should look something like this.
Select * (or whatever columns you want)
from SCDim_table SCD
where SCD.insert_date = (select max(SCD1.insert_date
from SCDim_table SCD1
where SCD.key_col = SCD1.key_col)

That would be my 1st crack at the SQL.
vinaymanchinila
Premium Member
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

Hi Dale,
thats exactly what I have settled with !
Thanks
Post Reply