Incremental Loading Without Date fields in source

ramesh9394 · Post by **ramesh9394** » Sat Mar 23, 2013 6:22 pm

Hi
my source is Sql server and Target oracle
i am trying to design a job to load Delta records ..but the source doesn't have any date fields ..so whenever i am loading data source fetching all the records...is there any way that i can fetch latest records from source
in other words
today i loaded 100 records after initial load...when running the job tomorrow i need only records fof tomorrow i don't want to fetch yesterday's 100 records again

ray.wurlod · Post by **ray.wurlod** » Sat Mar 23, 2013 8:40 pm

Really you have again neglected to provide us with sufficient information.

There may or may not be a mechanism for solving this problem. It really depends whether there is some way in the source data for identifying the delta.

For example if the source has an auto-incrementing key you can store the highest key retrieved today and tomorrow fetch just those records with a key value greater than this.

ramesh9394 · Post by **ramesh9394** » Sat Mar 23, 2013 8:46 pm

thanks for reply ray
i am using DS8.7 using Change Capture stage to capture the delta records
using odbc_connector as source and oracle_connector as target

ray.wurlod · Post by **ray.wurlod** » Sat Mar 23, 2013 11:13 pm

That doesn't affect my earlier answer.

jwiles · Post by **jwiles** » Sat Mar 23, 2013 11:43 pm

Has the client provided a method for identifying the new rows other than comparing the data row-to-row? Are there any unique keys available?

Regards,

ramesh9394 · Post by **ramesh9394** » Sun Mar 24, 2013 12:27 am

not provided any thing(any field which has update time stamp/system gen time stamp) to identify new rows...there is unique key... looks like i have to fetch everything from source every time ..but still looking for alternatives

i tried left joining source with target but even it also do the same(fetching all data and looking up on target)

chulett · Post by **chulett** » Sun Mar 24, 2013 1:21 am

This "unique key", is it a surrogate or any kind of field that increments in such a manner you can use it for your high water mark? Without that or a timestamp all you can do is pull everything and compare.

ramesh9394 · Post by **ramesh9394** » Sun Mar 24, 2013 2:44 am

unique key is not surrogate key its natural primary key

chulett · Post by **chulett** » Sun Mar 24, 2013 8:52 am

Can you create a work table in the source to hold the PK values you've extracted? Or are they willing to add a timestamp to the source? We're running out of options here.

BI-RMA · Post by **BI-RMA** » Mon Mar 25, 2013 8:07 am

Two questions:

1) Is fetching the delta relevant for your process?

You have been talking about a hundred rows in your first post. For something below a hundred thousand rows I would not consider the issue relevant, because Change-Captured is going to identify changes in almost no time at all.

2) Is the only kind of change possible an insert of new key-values, or is it possible that data-rows for known keys change content on non-key-columns? In the latter case you have to fetch all rows from your source anyway.

ramesh9394 · Post by **ramesh9394** » Mon Mar 25, 2013 3:28 pm

Thanks for reply
100 records is just an example i told...records are in millions from source ..
i am fetching all delta records like new inserted records and updated one(no-key value changes for known keys)..and also deleted records

ray.wurlod · Post by **ray.wurlod** » Mon Mar 25, 2013 3:48 pm

If there is no way to identify the delta at source then you will be obliged to fetch all the data from source every time and compare against target.

rameshrr3 · Post by **rameshrr3** » Tue Mar 26, 2013 2:35 pm

Depending on the version of SQL server you have , Change tracking can be enabled at the database level . Once you have done that . I think its an easy task to get only newly modified records( Adds/Updates)

Here is a good link for
Change Tracking in SQL Server 2008