Page 1 of 1

Starting of Data Stage Job

Posted: Thu May 15, 2008 6:19 am
by Pavan_Yelugula
Hi All
We are scheduling a Data Stage Job to be started every day at 7PM which is getting me the incremental load from a database.
My select query is as follows
select * from XX where datetime >= (sysdate-1)

Records are being pumped in to this table every second.

Is there a chance i will miss out some records because the data stage server takes a few seconds based on the server utilization for starting my job or will the job start for sure every day at 7PM and i will not miss any records

Thanks
Pavan

Re: Starting of Data Stage Job

Posted: Thu May 15, 2008 9:00 am
by Ultramundane
Pavan_Yelugula wrote:Hi All
We are scheduling a Data Stage Job to be started every day at 7PM which is getting me the incremental load from a database.
My select query is as follows
select * from XX where datetime >= (sysdate-1)

Records are being pumped in to this table every second.

Is there a chance i will miss out some records because the data stage server takes a few seconds based on the server utilization for starting my job or will the job start for sure every day at 7PM and i will not miss any records

Thanks
Pavan
Yes

Posted: Thu May 15, 2008 9:48 am
by bcarlson
We have a similar process that pulls changes from a source database every morning at 1am. To guarantee that we don't miss anything, our where clause intentionally overlaps with the last period:

Code: Select all

select <whatever>
from <source table>
where update_date >= CURRENT DATE - 2 days
;
We do it by date, but the same applies to using a timestamp. If you are running at 7pm and want everything from yesterday at 7pm to today, then look at yesterday from 6pm to today at 7pm instead. Overlap by an hour.

You will need to dedup your input data against your target table, or eliminate any matches from the input to your target, but that should eliminate the possibility of missing anything.

Hope this helps.

Brad.