delete large number of records in DB2

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Back to Premium Content... If you're doing something like a daily rebuild of a datamart, truncation or reorg methods are optimal. No matter what you're doing, you have to fit within rollback. :wink:

Dave's original question is about removing 1/65th of the data weekly and adding 1/65th more. Could you justify a reorg for 1/65th of your data being deleted, if you consider database downtime during the reorg as well as the nodes being slammed for the duration? If you reorg every 4th week, at least you've moved to 1/16th and can schedule it around users needs.

My preference is to never delete as part of the ETL, but to update with a logical delete (a delete indicator column) so that the operation is reversible should you later discover errant data, or simply wish to audit the process. Then, during a maintenance cycle physically delete the older logically deleted data and then if the delete/non-deleted ratio reaches a threshold then conduct the reorg.

The strategies definetely have to be worked out. Great discussion!
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
shamshad
Premium Member
Premium Member
Posts: 147
Joined: Wed Aug 25, 2004 1:39 pm
Location: Detroit,MI

Post by shamshad »

Regardless of what ETL Tool one uses, if ETL code is generating one DELETE statement per row to be deleted, it always a database overhead...In most cases when that ETL code runs and it starts firing multiple delete statements, databases server is bogged down by the number of SQL statements.....I have seen this problem in many warehouse projects.....

As someone suggested earlier, one of the ways to address the issue is to dump all the rows that are to BE DELETED in a Table and then Issue a DELETE STATEMENT for every row in the TEMP table.

Example:
If all the rows to be deleted are saved in a table called "EmptoDelete" that has EmployeeID and EmployeeMaster is the target....

Then do something like

DELETE FROM EMPLOYEEMASTER WHERE EMPLOYEEID IN
(
SELECT EMPLOYEEID FROM EMPTODELETE
);

This will generate one SQL statement but remember Delete has its own overhead. Also make sure that proper index are in place to speedup the query.
manojmathai
Participant
Posts: 23
Joined: Mon Jul 04, 2005 6:25 am

Post by manojmathai »

Hi

You can alter the table using command so that the table will not write to rollback segment. I am not sure about the commands. Its some thing like

ALTER TABLE table_name ACTIVATE NOT LOGGED INITIALLY;
DELETE FROM table_name...
COMMIT;

This will help you delete from the table faster.

Thanks
Manoj
Post Reply