Page 1 of 1

Incremental Loading Without Date fields in source

Posted: Sat Mar 23, 2013 6:22 pm
by ramesh9394
Hi
my source is Sql server and Target oracle
i am trying to design a job to load Delta records ..but the source doesn't have any date fields ..so whenever i am loading data source fetching all the records...is there any way that i can fetch latest records from source
in other words
today i loaded 100 records after initial load...when running the job tomorrow i need only records fof tomorrow i don't want to fetch yesterday's 100 records again

Posted: Sat Mar 23, 2013 8:40 pm
by ray.wurlod
Really you have again neglected to provide us with sufficient information.

There may or may not be a mechanism for solving this problem. It really depends whether there is some way in the source data for identifying the delta.

For example if the source has an auto-incrementing key you can store the highest key retrieved today and tomorrow fetch just those records with a key value greater than this.

Posted: Sat Mar 23, 2013 8:46 pm
by ramesh9394
thanks for reply ray
i am using DS8.7 using Change Capture stage to capture the delta records
using odbc_connector as source and oracle_connector as target

Posted: Sat Mar 23, 2013 11:13 pm
by ray.wurlod
That doesn't affect my earlier answer.

Posted: Sat Mar 23, 2013 11:43 pm
by jwiles
Has the client provided a method for identifying the new rows other than comparing the data row-to-row? Are there any unique keys available?

Regards,

Posted: Sun Mar 24, 2013 12:27 am
by ramesh9394
not provided any thing(any field which has update time stamp/system gen time stamp) to identify new rows...there is unique key... looks like i have to fetch everything from source every time ..but still looking for alternatives

i tried left joining source with target but even it also do the same(fetching all data and looking up on target)

Posted: Sun Mar 24, 2013 1:21 am
by chulett
This "unique key", is it a surrogate or any kind of field that increments in such a manner you can use it for your high water mark? Without that or a timestamp all you can do is pull everything and compare.

Posted: Sun Mar 24, 2013 2:44 am
by ramesh9394
unique key is not surrogate key its natural primary key

Posted: Sun Mar 24, 2013 8:52 am
by chulett
Can you create a work table in the source to hold the PK values you've extracted? Or are they willing to add a timestamp to the source? We're running out of options here.

Posted: Mon Mar 25, 2013 8:07 am
by BI-RMA
Two questions:

1) Is fetching the delta relevant for your process?

You have been talking about a hundred rows in your first post. For something below a hundred thousand rows I would not consider the issue relevant, because Change-Captured is going to identify changes in almost no time at all.

2) Is the only kind of change possible an insert of new key-values, or is it possible that data-rows for known keys change content on non-key-columns? In the latter case you have to fetch all rows from your source anyway.

Posted: Mon Mar 25, 2013 3:28 pm
by ramesh9394
Thanks for reply
100 records is just an example i told...records are in millions from source ..
i am fetching all delta records like new inserted records and updated one(no-key value changes for known keys)..and also deleted records

Posted: Mon Mar 25, 2013 3:48 pm
by ray.wurlod
If there is no way to identify the delta at source then you will be obliged to fetch all the data from source every time and compare against target.

Posted: Tue Mar 26, 2013 2:35 pm
by rameshrr3
Depending on the version of SQL server you have , Change tracking can be enabled at the database level . Once you have done that . I think its an easy task to get only newly modified records( Adds/Updates)

Here is a good link for
Change Tracking in SQL Server 2008