Problem with delete using DB2 enterprise stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vigneshra
Participant
Posts: 86
Joined: Wed Jun 09, 2004 6:07 am
Location: Chennai

Problem with delete using DB2 enterprise stage

Post by vigneshra »

Hi

We have an issue with a parallel job that purges records from a DB2 UDB table. We extract the key columns from table1 and delete the matching records from table2; both tables share the same set of key columns. When the job runs, it does not delete all the records in table2: each run deletes some records, a rerun deletes some more, but it never deletes everything in one go. The array size and transaction size are both set to 50000. The target DB2 stage executes in parallel and the partitioning is set to DB2. The job runs perfectly in the development environment but not in production, so we want to identify the root cause of this issue. Any help is appreciated.
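
For reference, the delete in the target DB2 stage is expressed roughly like the following user-defined SQL (the table and key column names here stand in for our actual ones):

    DELETE FROM table2
    WHERE key_col1 = ORCHESTRATE.key_col1
      AND key_col2 = ORCHESTRATE.key_col2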
Vignesh.

"A conclusion is simply the place where you got tired of thinking."
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

If you delete that many records, the transaction log is going to fill up.
But that would give you an error, and since you did not get any error you may not have this problem right now. Make sure you are not logging these delete activities.
Kris

Where's the "Any" key?-Homer Simpson
Daddy Doma
Premium Member
Premium Member
Posts: 62
Joined: Tue Jun 14, 2005 7:17 pm
Location: Australia
Contact:

Post by Daddy Doma »

If you want to replace the entire table it would be better NOT to use Delete (or Truncate) with DB2 in DataStage, because an unqualified DELETE will be passed to the database and the log files will fill up.

Faced with a similar situation, I wrote a script to create an empty file in Unix and then LOAD ... REPLACE the database table, truncating all the records but leaving column names and data types intact. I also had a second script to restore referential integrity to the dependent database tables.

You would create a custom routine that lets the user enter the database name, schema and table. Then, in a sequence job, use this routine to truncate your table before loading it. No need for an unqualified DELETE, no log files get filled, and performance will be a lot quicker.
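
A minimal sketch of the idea from the DB2 command line (database, schema and table names are placeholders, and NONRECOVERABLE assumes you can afford to skip forward recovery for this table):

    CONNECT TO MYDB;
    -- LOAD ... REPLACE from an empty file empties the table without
    -- writing the deletes to the transaction log
    LOAD FROM /dev/null OF DEL REPLACE INTO MYSCHEMA.MYTABLE NONRECOVERABLE;
    -- the table (and its dependents) may be left in set-integrity pending
    -- state, so check it afterwards
    SET INTEGRITY FOR MYSCHEMA.MYTABLE IMMEDIATE CHECKED;
    CONNECT RESET;

Run it with db2 -tvf, or wrap the same commands in the Unix script.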
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Great tip, Daddy Doma.

You should also have a look at capturing the SQLCODEs from the failed delete statements. I am afraid you are suffering from row leakage; follow the link for instructions on trapping error messages from your DB2 enterprise stage. You might fix this one problem, but you could still have rows leaking out of your other DB2 enterprise stages.
bakul
Participant
Posts: 60
Joined: Wed Nov 10, 2004 2:12 am

Post by bakul »

What are the options when you want to delete based on a key, say a date, rather than truncate the entire table?
Selecting the rows from the database first and then deleting by those keys is one. But wouldn't that mean selecting millions of rows from the table, landing them in a dataset, and then firing the delete query, when the delete does not actually need those keys and could work just as well with the date key alone?
Regards,
Bakul
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi Bakul, if I understand correctly, you need to delete on a date basis. How big is the table, i.e., how many days of data? If the data you intend to delete is small compared to the existing data, a plain DELETE FROM <table> WHERE <condition> will work. Maybe, if you are concerned about the transaction log, you can delete in chunks with the help of a loop in an SP.
If the data you intend to delete is more than about 50% of the table, you can insert the rest of the rows into another table and truncate the current table.
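
A rough sketch of that second option (table names and the 90-day cutoff are just placeholders; the empty-file LOAD ... REPLACE is the same logging-free truncate trick mentioned earlier in the thread):

    -- keep the rows that should survive in a staging table with the same layout
    INSERT INTO MYSCHEMA.MYTABLE_KEEP
        SELECT * FROM MYSCHEMA.MYTABLE
        WHERE load_date >= CURRENT DATE - 90 DAYS;
    COMMIT;
    -- empty the original table without logging the deletes
    LOAD FROM /dev/null OF DEL REPLACE INTO MYSCHEMA.MYTABLE NONRECOVERABLE;
    -- put the kept rows back
    INSERT INTO MYSCHEMA.MYTABLE SELECT * FROM MYSCHEMA.MYTABLE_KEEP;
    COMMIT;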
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
bakul
Participant
Posts: 60
Joined: Wed Nov 10, 2004 2:12 am

Post by bakul »

Thanks for the reply Kumar.
The data that I need to delete is well under 50% of the table, but it still causes the transaction log to overflow.

Deleting in chunks with a loop seems the preferred approach.
Could you please explain what you mean by "delete in chunks with the help of a loop in an SP"? I am not sure what you mean by SP. :?
Regards,
Bakul
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

I just meant that, rather than deleting the whole set of rows in one go, you can delete, say, 5,000 or 10,000 rows per transaction, so that the transaction log does not fill up. You can write a stored procedure (that is the SP) that loops until it gets "no data found" for the given condition, and commits after every 5K or 10K rows. If you want to do it within DataStage, use DELETE and try setting the Transaction size to a smaller value so that commits happen more often.
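
A minimal sketch of such a procedure in DB2 SQL PL (the table, column, procedure name, cutoff date and 10,000-row chunk size are all placeholders):

    CREATE PROCEDURE PURGE_OLD_ROWS (IN p_cutoff DATE)
    LANGUAGE SQL
    BEGIN
        DECLARE v_deleted INTEGER DEFAULT 1;
        WHILE v_deleted > 0 DO
            -- DB2 allows deleting from a fullselect, so FETCH FIRST
            -- caps the number of rows deleted in each chunk
            DELETE FROM (SELECT 1 FROM MYSCHEMA.MYTABLE
                         WHERE load_date < p_cutoff
                         FETCH FIRST 10000 ROWS ONLY);
            GET DIAGNOSTICS v_deleted = ROW_COUNT;
            -- commit each chunk so the transaction log is released as you go
            COMMIT;
        END WHILE;
    END

    CALL PURGE_OLD_ROWS('2006-01-01');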
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'