Remove duplicate Vs Upsert

keshav0307 · Post by **keshav0307** » Wed May 21, 2008 9:47 am

One of my jobs sometimes runs very slowly and at other times runs as fast as expected.

It reads a sequential file, removes duplicate records, and loads the result into an Oracle target table using upsert.

Would an Oracle upsert on the source data with the duplicates left in (that is, without the Remove Duplicates stage) be faster than my current approach?

wesd · Post by **wesd** » Wed May 21, 2008 2:09 pm

keshav0307 wrote: One of my jobs sometimes runs very slowly and at other times runs as fast as expected.

It reads a sequential file, removes duplicate records, and loads the result into an Oracle target table using upsert.

Would an Oracle upsert on the source data with the duplicates left in (that is, without the Remove Duplicates stage) be faster than my current approach?

Difficult to say with the information you've provided. I would keep the Remove Duplicates stage rather than rely on the DB engine to update the same row twice.
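To make the idea concrete outside DataStage, here is a minimal Python sketch of removing duplicates by key before the load. The function name `remove_duplicates`, the `key_fields` parameter, and the choice to keep the last record per key are assumptions made for illustration; the Remove Duplicates stage can be configured to keep the first or the last record, and the thread does not say which the job uses.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]


def remove_duplicates(records: Iterable[Record],
                      key_fields: Tuple[str, ...]) -> List[Record]:
    """Keep the last record seen for each key (hypothetical helper).

    Output order follows the first appearance of each key, because a dict
    keeps a key's original position when its value is overwritten.
    """
    latest: Dict[Tuple[str, ...], Record] = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        latest[key] = rec  # a later duplicate replaces the earlier one
    return list(latest.values())


if __name__ == "__main__":
    rows = [
        {"id": "1", "val": "a"},
        {"id": "2", "val": "b"},
        {"id": "1", "val": "c"},  # duplicate key "1"
    ]
    for row in remove_duplicates(rows, ("id",)):
        print(row)
```

With the duplicates removed up front, each key reaches the database once, so the engine never has to insert a row and then update it again in the same run.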

jasper · Post by **jasper** » Thu May 22, 2008 12:09 am

I would focus more on the upsert to Oracle to find the cause of your fluctuations. Performance there changes a lot depending on how many records are inserts and how many are updates.
I once did a quick test in our environment where all inserts were 6 times faster than all updates. In most cases the job will be faster if you can split the inserts and updates and send them to two separate Oracle stages (which are, of course, not set to upsert).
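As a rough illustration of that split, the Python sketch below routes records into an insert stream and an update stream by checking each key against the keys already in the target. The names `split_inserts_updates` and `existing_keys` are hypothetical; in a real job the existing keys would typically come from a lookup against the target table, and the input is assumed to be deduplicated already (otherwise a repeated new key would land in the insert stream twice).

```python
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]
Key = Tuple[str, ...]


def split_inserts_updates(records: Iterable[Record],
                          key_fields: Tuple[str, ...],
                          existing_keys: Set[Key]
                          ) -> Tuple[List[Record], List[Record]]:
    """Route each record to the insert or update stream (hypothetical helper)."""
    inserts: List[Record] = []
    updates: List[Record] = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in existing_keys:
            updates.append(rec)  # key already in target: update-only stage
        else:
            inserts.append(rec)  # new key: insert-only stage
    return inserts, updates


if __name__ == "__main__":
    incoming = [{"id": "1", "val": "x"}, {"id": "3", "val": "y"}]
    in_target = {("1",), ("2",)}  # assumed result of a key lookup
    new_rows, changed_rows = split_inserts_updates(incoming, ("id",), in_target)
    print("inserts:", new_rows)
    print("updates:", changed_rows)
```

Each stream can then go to its own target stage with a plain insert or a plain update, so the database does not have to try one operation and fall back to the other for every row.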

keshav0307 · Post by **keshav0307** » Thu May 22, 2008 2:09 am

It will be insert only if I use the Remove Duplicates stage.

DSXchange
