Slow Delete Operation in ODBC Stage

abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Slow Delete Operation in ODBC Stage

Post by abhilashnair »

I have a job which deletes rows from a table for specified ids. The job has a source ODBC stage, a Transformer, and a target ODBC stage. The database is DB2. The source stage fetches the ids based on the specified conditions and the target stage deletes those rows from the table. The Transformer is only there to rename the id column, since it goes by different names on source and target.
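
In effect, the target ODBC stage ends up issuing a parameterised DELETE for each incoming row, something like the statement below (the schema, table and column names here are placeholders, not our real ones):

    DELETE FROM tgt_schema.tgt_table   -- placeholder names
    WHERE  tgt_id = ?                  -- bound from each incoming row's id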

This job is very slow, usually 4-5 rows per second. The config file is a single-node file. We used to have a four-node file, but that was causing locks on the database, so we switched to a single-node file. Currently the target ODBC stage has Auto partitioning set on it. Do we need to change that? How do I increase job performance?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DELETE is a slow operation because it's transactional and, if executed row by row as in your design, first has to evaluate a WHERE clause to establish the rows to delete. An index on the column(s) in the WHERE clause will help. How is your transaction size (record count) set?
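
For example, assuming the delete key is the column in your WHERE clause (names below are illustrative only), a suitable index in DB2 would be:

    CREATE INDEX tgt_schema.ix_tgt_id
        ON tgt_schema.tgt_table (tgt_id);

Without such an index, the database has to scan the whole table for every single row it deletes.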

You can also run on more than one node if you partition on the column(s) in the WHERE clause, to prevent cross-node locking from occurring.
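
For reference, a four-node configuration file has the shape sketched below; the fastname and resource paths are placeholders for your own environment:

    {
        node "node1"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
        node "node2"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
        node "node3"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
        node "node4"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
    }

With Hash partitioning on the delete key, all rows for a given id land on the same node, so no two nodes ever try to delete the same rows, which is what causes the lock contention.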
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

I need to sign up as a premium member soon :) :roll:
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

ray.wurlod wrote:DELETE is a slow operation because it's transactional and, if executed row by row as in your design, first has to evaluate a WHERE clause to establish the rows to delete. An index on the column(s) in the WHE ...
I created an index on the column used in the WHERE condition and also changed the partitioning to Hash. I also specified an array size of 2000 in the target ODBC stage where the delete is happening. Still there is no improvement in performance.

We are still using a single-node file. How can we make this work on a four-node file without causing deadlocks?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Try it now that you are hash partitioning...
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

chulett wrote:Try it now that you are hash partitioning...
Still the performance is the same. Maybe it's because I am running the job on a single node, so hash partitioning is of no use.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

abhilashnair wrote:How can we make this work on a four-node file without causing deadlocks?
chulett wrote:Try it now that you are hash partitioning...
Is that more clear?
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

I tried hash partitioning on the target ODBC stage where the delete is happening. The partitioning key is the same column on which the index is created on the table, and the same column used in the WHERE clause.

I am getting a SIGSEGV error and the job is aborting. Please help.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Those Segment Violation error signals can be tough to track down. Have you done an exact search for SIGSEGV here? Can you post the complete, unedited error message here?
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

Strangely, when I ran the job today, it went off fine. The only change I made was setting APT_DISABLE_COMBINATION=True. I did this so that I could tell exactly where the error was coming from. I was planning to run the job and then paste "the complete, unedited error message here" as chulett asked me to do... but it ran just fine, without even a single warning.
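
(If anyone wants to reproduce this: I added it as an environment-variable job parameter in Designer; the engine-level equivalent, if I have the dsenv syntax right, is just)

    # in $DSHOME/dsenv, or set per job as an environment-variable parameter
    export APT_DISABLE_COMBINATION=True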

For cross-checking, I also queried the table to verify that the deletes had happened. It looks good!

I will keep this job under "observation" for 2 or 3 runs and let you guys know.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It may be the case that the total memory demanded by the single process running the combined operators was excessive, and disabling operator combination means more processes but less per-process memory demand.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

The SIGSEGV has returned to haunt this job again. The error message is

StageName,2: Operator terminated abnormally: received signal SIGSEGV.

The StageName corresponds to the target ODBC stage where the deletes are happening. I changed the job to a single node and it ran fine, but far too slow.

How can I rectify this?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Probably time to involve your official support provider.
-craig

"You can never have too many knives" -- Logan Nine Fingers