Improving performance of deletes

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
akarsh
Participant
Posts: 51
Joined: Fri May 09, 2008 4:03 am
Location: Pune

Improving performance of deletes

Post by akarsh »

Hi All,

I am facing the same issue as seen here.

I have 3.7L record in file which i need to delete from table which contains 90M.I Am using below query
DELETE
FROM
Table
WHERE
Field1 = ORCHESTRATE.Field1

Design is simple
DS -> Tfm -> Oracle connector
in designer till connector data is moving as 877962 record/sec but from oracle its moving with the rate of 450 record/sec.
i have kept transaction size as well as array size both as 40000. and also did hash partition on field1 which is key field.
Rest settings are default.

Please help how to improve the performance.
Thanks,
Akarsh Kapoor
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I needed to clean up the quintuple posting you managed somehow ( I think that's a new record :wink: ) and then moved you off to your own post. Similar != same, so best to post your own details and link back to anything older you think may be relevant. As I've done here.

First question is easy - do you have an index over "Field1"? Without it you are doing a full table scan for each delete.

Oh... and please don't ask the same question in multiple threads.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:idea: No need to ask your question again, it's right here... safe and sound.
-craig

"You can never have too many knives" -- Logan Nine Fingers
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

That's a really massive array / transaction size. Have you tried with a smaller array / transaction size?
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
PaulVL
Premium Member
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

I wonder if you insert your Field1 into an empty temp.table, you could then execute an after sql statement of "delete from table where filed1 in temp.table"

might be overkill.

But that field one column better sure be a index/key for speed.

I am not a database savvy guy.

reorgs depending on the database type are a good idea.

pre-sorting the field1 would be handy in most cases.
akarsh
Participant
Posts: 51
Joined: Fri May 09, 2008 4:03 am
Location: Pune

Improving performance of deletes

Post by akarsh »

Thanks chulett for deleting multiple post. :D
Even was not able to find how come multiple post i made :)

Well yes table is indexed on field one, and am trying to run using DRS connector in place of oracle connector. will update you if any thing positive as so far facing space issue in DB.
Thanks,
Akarsh Kapoor
akarsh
Participant
Posts: 51
Joined: Fri May 09, 2008 4:03 am
Location: Pune

Re: Improving performance of deletes

Post by akarsh »

Hi All,

I am back with the same issue :( .

Now job is giving deadlock issue. Let me give details

Ds -> Tfm -> DRS connector

Ds has single field, on which data is hash partitioned and index is present in table on same field(this field is part of composite key, so composite index is present). In DRS connector have kept commit interval is 80000 as data is huge.

DS has non duplicate records. and has huge data. table also has app 100M records on which delete opearation needs to be done.

Please suggest what needs to be done so that job completes with out giving performance issue. :(
Thanks,
Akarsh Kapoor
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Deadlocks are not a 'performance issue'. And I'd wager if you took the error you're getting here and searched for it, you'd get the answer you're looking for. Failing that, post the error.
-craig

"You can never have too many knives" -- Logan Nine Fingers
akarsh
Participant
Posts: 51
Joined: Fri May 09, 2008 4:03 am
Location: Pune

Post by akarsh »

Hi Chulett,

We updated the job to remove deadlock issue by running DRS connector in seq mode and it took close to 12 hr to finish, which is not expected.

Please suggest if any change we can do now?
Thanks,
Akarsh Kapoor
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Search. Post the error.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply