Performance Tuning by dropping the Index

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
subramanya
Participant
Posts: 22
Joined: Fri Oct 15, 2004 11:53 pm
Location: Bangalore, India
Contact:

Performance Tuning by dropping the Index

Post by subramanya »

Hi,

Heard that dropping the index will give better performance in some jobs. Can anybody explain how to do it and how it happens?

Regards
Subramanya
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

"Dropping an index" is a Database term; it is explained in great and painful detail in the appropriate forums.

Basically when you write to a table, you not only write the data but simultaneously the Database does such things as check column or table constraints, fire off triggers and maintain the indices to that row. This can take quite a bit of processing and I/O time.

In many cases it makes sense to disable or drop an index, load large amounts of data, and when finished have the database rebuild that index (or those indices). Quite often this is much more efficient.

There are drawbacks to this method. First, it takes more access rights to the database tables than most DBAs are willing to give. In addition, if the data contains errors then the rebuild index will fail and DataStage cannot do anything about that. With a defunct index some SELECTS can run incredibly slow - instead of one lookup on an index it will have to do a full table scan.

This is the quick-and-dirty explanation and I've left out a lot of fine detail; but knowledge of the interaction between DataStage (or any ETL tool) and the databases used is a very important aspect of what many of us in this forum do for a living.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The main contraindicator is that, while the index is dropped or disabled, no query can use the index and therefore will be much slower if constraining on the (non-)indexed column.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply