Getting better upsert performance

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dsisbank
Premium Member
Posts: 50
Joined: Fri Aug 11, 2006 4:53 am

Getting better upsert performance

Post by dsisbank »

Hi,
I use the write method UPSERT with upsert mode "Update then Insert", but performance is very bad. When I use "Insert then Update" mode, performance is very good, but the data is doubled if it has already been loaded before.
How can I solve this?
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Why is the data doubling? That should not happen. Try trimming the keys and then sending them in. What about at the database end: are the keys a surrogate number or a character key?

You're seeing an increase in performance for 'insert then update' because there are more inserts than updates. The opposite will slow it down for the same reason. The best approach: split the inserts and updates.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
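The split-the-streams approach above can be sketched outside DataStage. This is an illustrative sketch only, using Python's sqlite3 in place of the real target database; the table `target` and its columns are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO target VALUES (1, 'old')")

incoming = [(1, "alice"), (2, "bob")]

# Pre-determine which rows are updates (key already present)
# and which are pure inserts, instead of upserting row by row.
existing = {row[0] for row in conn.execute("SELECT cust_id FROM target")}
updates = [r for r in incoming if r[0] in existing]
inserts = [r for r in incoming if r[0] not in existing]

# Each stream then gets the statement that fits it best.
conn.executemany("UPDATE target SET name = ? WHERE cust_id = ?",
                 [(name, cid) for cid, name in updates])
conn.executemany("INSERT INTO target VALUES (?, ?)", inserts)
conn.commit()
```

In a real job the split would typically be done with a lookup or change-capture stage feeding two separate database links, one insert-only and one update-only.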
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: Getting better upsert performance

Post by chulett »

dsisbank wrote:I use the write method UPSERT with upsert mode "Update then Insert", but performance is very bad. When I use "Insert then Update" mode, performance is very good, but the data is doubled if it has already been loaded before.
You need unique indexes of some sort, PK or otherwise, to make anything like this work. It only does the second action if the first one fails after all.

If Update then Insert is slow, it's because it is doing full table scans for each record. If Insert then Update is fast but 'doubles' everything, that's because it is just inserting, which is allowed because you've got nothing to enforce uniqueness. I'll bet you if you run it again, it 'triples'... etc.

Add a unique index over the 'key' fields then pick the methodology that will get you the most 'hits' on the first try. Or as noted, split your actions.
-craig

"You can never have too many knives" -- Logan Nine Fingers
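The failure mode chulett describes is easy to reproduce. A minimal sketch, again using sqlite3 as a stand-in for the target database (table and index names are made up): without a unique index, re-running the load just doubles the rows; with one, the second INSERT fails and an insert-then-update strategy falls through to the UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No uniqueness enforced: re-running the same load doubles the data.
conn.execute("CREATE TABLE t_noidx (k INTEGER, v TEXT)")
for _ in range(2):
    conn.execute("INSERT INTO t_noidx VALUES (1, 'x')")
# t_noidx now holds two copies of the same key.

# With a unique index, the duplicate INSERT raises an error,
# so the 'second action' (the UPDATE) can take over.
conn.execute("CREATE TABLE t_idx (k INTEGER, v TEXT)")
conn.execute("CREATE UNIQUE INDEX ux_t_idx ON t_idx (k)")
for v in ("x", "y"):
    try:
        conn.execute("INSERT INTO t_idx VALUES (1, ?)", (v,))
    except sqlite3.IntegrityError:
        conn.execute("UPDATE t_idx SET v = ? WHERE k = 1", (v,))
# t_idx holds exactly one row for key 1, with the latest value.
```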
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

You must have a unique index on the upsert key columns (all the columns you have used in the WHERE clause of the update statement).
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

INSERTs are usually very fast; UPDATEs are notoriously slow. The best methods for loading are to partition-align your data with the database, avoid global indexes on table structures, pre-determine inserts from updates, pure-insert load your inserts, insert your updates into a runtime table, and use parallel DML to mass-update your data within the database environment.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
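The "insert your updates into a runtime table, then mass-update" step above can be sketched as follows. This is a simplified sqlite3 illustration of the set-based pattern, not parallel DML itself; the `target` and `stage` tables are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (k INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "a"), (2, "b")])

# Bulk-insert the update rows into a runtime (staging) table first;
# inserts are cheap, even for rows destined to become updates.
conn.execute("CREATE TABLE stage (k INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO stage VALUES (?, ?)", [(1, "a2"), (2, "b2")])

# Then apply all the updates in one set-based statement inside the
# database, instead of shipping them over row by row.
conn.execute("""
    UPDATE target
    SET v = (SELECT s.v FROM stage s WHERE s.k = target.k)
    WHERE k IN (SELECT k FROM stage)
""")
conn.commit()
```

On a database that supports it (e.g. Oracle), the final statement would be run with parallel DML enabled so the mass update itself is parallelized.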