Remove duplicate Vs Upsert

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
keshav0307
Premium Member
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Remove duplicate Vs Upsert

Post by keshav0307 »

one of my job, sometimes run very slowly and some times as expected fast.

it reads a sequential file, remove the duplicate records and insert into oracle target table, using upsert.

will oracle upsert with duplicate source data( if i remove the remove duplicate stage) be faster then my current approach??
wesd
Participant
Posts: 22
Joined: Mon Aug 16, 2004 8:56 pm

Re: Remove duplicate Vs Upsert

Post by wesd »

keshav0307 wrote:one of my job, sometimes run very slowly and some times as expected fast.

it reads a sequential file, remove the duplicate records and insert into oracle target table, using upsert.

will oracle upsert with duplicate source data( if i remove the remove duplicate stage) be faster then my current approach??
Difficult to say with the information you've provided. I would do the remove duplicates instead of relying on the DB engine to update the same row twice.
Wes Dumey
Senior Consultant
Data Warehouse Projects
jasper
Participant
Posts: 111
Joined: Mon May 06, 2002 1:25 am
Location: Belgium

Post by jasper »

I would focus more on the upsert to oracle part to find your fluctuations. Here the performance changes a lot depending on how many records are inserts and how many are updates.
I once did a quick test on our environment where all inserts was 6 times faster then all updates. In most cases the job will be faster if you are able to split the insert and updates and send them to 2 seperate oracle stages.(which are offcource not set to upsert).
keshav0307
Premium Member
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

it will be insert only, if i use remove duplicate stage.
Post Reply