Slow Delete Operation in ODBC Stage

abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Slow Delete Operation in ODBC Stage

Post by abhilashnair »

I have a job which deletes rows from a table for specified ids. The job has a source ODBC stage, a Transformer, and a target ODBC stage. The database is DB2. The source stage fetches the ids based on the specified conditions and the target stage deletes those rows from the table. The Transformer is only there to rename the id column, since it goes by different names on source and target.
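
In effect, the target ODBC stage ends up issuing a parameterised DELETE for each incoming row, something like the statement below (the schema, table and column names here are placeholders, not our real ones):

    DELETE FROM tgt_schema.tgt_table   -- placeholder names
    WHERE  tgt_id = ?                  -- bound from each incoming row's id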

This job is very slow, usually 4-5 rows per second. The config file is a single-node file. We used to have a four-node file, but that was causing locks on the database, so we switched to a single-node file. Currently the target ODBC stage has Auto partitioning set on it. Do we need to change that? How do I increase job performance?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DELETE is a slow operation because it's transactional and, if executed row by row as in your design, first has to evaluate a WHERE clause to establish the rows to delete. An index on the column(s) in the WHERE clause will help. How is your transaction size (record count) set?
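
For example, assuming the delete key is the column in your WHERE clause (names below are illustrative only), a suitable index in DB2 would be:

    CREATE INDEX tgt_schema.ix_tgt_id
        ON tgt_schema.tgt_table (tgt_id);

Without such an index, the database has to scan the whole table for every single row it deletes.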

You can also run on more than one node if you partition on the column(s) in the WHERE clause, to prevent cross-node locking from occurring.
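
For reference, a four-node configuration file has the shape sketched below; the fastname and resource paths are placeholders for your own environment:

    {
        node "node1"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
        node "node2"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
        node "node3"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
        node "node4"
        {
            fastname "etl_host"
            pools ""
            resource disk "/ds/data" {pools ""}
            resource scratchdisk "/ds/scratch" {pools ""}
        }
    }

With Hash partitioning on the delete key, all rows for a given id land on the same node, so no two nodes ever try to delete the same rows, which is what causes the lock contention.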
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

I need to sign up as a premium member soon :) :roll:
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

ray.wurlod wrote:DELETE is a slow operation because it's transactional and, if executed row by row as in your design, first has to evaluate a WHERE clause to establish the rows to delete. An index on the column(s) in the WHE ...
I created an index on the column used in the WHERE condition and also changed the partitioning to Hash. I also specified an array size of 2000 in the target ODBC stage where the delete is happening. Still there is no improvement in performance.

We are still using a single-node file. How can we make this work on a four-node file without causing deadlocks?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Try it now that you are hash partitioning...
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

chulett wrote:Try it now that you are hash partitioning...
Still the performance is the same. Maybe it's because I am running the job on a single node, so hash partitioning is of no use.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

abhilashnair wrote:How can we make this work on a four-node file without causing deadlocks?
chulett wrote:Try it now that you are hash partitioning...
Is that more clear?
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

I tried hash partitioning on the target ODBC stage where the delete is happening. The partitioning key is the same column on which the index is created on the table, and the same column used in the WHERE clause.

I am getting a SIGSEGV error and the job is aborting. Please help.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Those Segment Violation error signals can be tough to track down. Have you done an exact search for SIGSEGV here? Can you post the complete, unedited error message here?
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

Strangely, when I ran the job today, it went off fine. The only change I made was setting APT_DISABLE_COMBINATION=True. I did this so that I could tell exactly where the error was coming from. I was planning to run the job and then paste "the complete, unedited error message here" as chulett asked me to do... but it ran just fine, without even a single warning.
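
(If anyone wants to reproduce this: I added it as an environment-variable job parameter in Designer; the engine-level equivalent, if I have the dsenv syntax right, is just)

    # in $DSHOME/dsenv, or set per job as an environment-variable parameter
    export APT_DISABLE_COMBINATION=True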

For cross-checking, I also queried the table to verify that the deletes had happened. It looks good!

I will keep this job under "observation" for 2 or 3 runs and let you guys know.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It may be the case that the total memory demanded by the single process running the combined operators was excessive, and disabling operator combination means more processes but less per-process memory demand.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

The SIGSEGV has returned to haunt this job again. The error message is

StageName,2: Operator terminated abnormally: received signal SIGSEGV.

The StageName corresponds to the target ODBC stage where the deletes are happening. I changed the job to a single node and it ran fine, but far too slow.

How can I rectify this?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Probably time to involve your official support provider.
-craig

"You can never have too many knives" -- Logan Nine Fingers