DB2 Connector performance issue

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

neena
Participant
Posts: 90
Joined: Mon Mar 31, 2003 4:32 pm

DB2 Connector performance issue

Post by neena »

Hi everyone, I have been struggling with a job performance issue with updates in the DB2 Connector. I'm wondering if anyone can shed some light on where it might be going wrong.

I have a simple job like below

Code: Select all

Data set  --> Transformer --> DB2 connector (Update)
I'll give a high-level idea of the data and what we are doing.

We are processing 60 million records and updating on a 5-part natural key.
We are using the DB2 Connector (partitioning is set to DB2 connector). Transaction record count = 6000 and array size = 2000. No sorting is done on any keys.
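For reference, the two settings above interact roughly like this (a minimal sketch; the function and loop are illustrative, not the connector's real API): rows are bound into arrays of 2000 and a COMMIT is issued once 6000 rows are pending.

```python
# Illustrative model of array binding plus transaction record count
# (assumption: the connector commits whole arrays, so the commit point
# falls on an array boundary).
def simulate_batching(total_rows, array_size=2000, txn_record_count=6000):
    executes = commits = pending = 0
    for start in range(0, total_rows, array_size):
        executes += 1                               # one array-bound EXECUTE per batch
        pending += min(array_size, total_rows - start)
        if pending >= txn_record_count:
            commits += 1                            # COMMIT every 6000 rows
            pending = 0
    if pending:
        commits += 1                                # final partial transaction
    return executes, commits

# 60 million rows -> 30,000 array executes and 10,000 commits
print(simulate_batching(60_000_000))
```

So the job is not committing row by row; per-statement overhead is already amortized, which points the finger at the per-row work on the database side rather than at the transaction settings.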

Coming to the table: it is partitioned by one of the date fields (this field is not in any index). There is an index on the 5-part natural key that we are updating (and we had 32 indexes on this table in total). Here are the things we have tried so far without success. This is in production.

1. Initially the update was running at 700 rows/sec, so we dropped a couple of indexes and tried again, but it made no difference.
2. We removed the foreign key relation and tried, but no improvement.
3. We removed the partitioning on the table and tried, but no improvement. (Interestingly, in development, with the table unpartitioned in this scenario we were getting 5,000 rows/sec; after partitioning it dropped to 2,500 rows/sec.)
4. We ran an explain plan on the query and found that the index did not add any overhead to the update.

I have tried reading through the forum and found a couple of threads on this topic, but wasn't able to find the answer. Any suggestions would be really appreciated.
neena

Post by neena »

When we do a bulk load, the job performs well at 30,000 rows/sec. It's the updates that are causing the issue.

Another interesting thing: we have a similar job that does updates, but its data volume is only 3 million rows. That update job runs at 10,000 rows/sec. Volume is the major difference, but would that affect the throughput of the job this much?

The only thing we haven't tried is using the field the target table is partitioned on as one of the keys in the index, or as part of the update query.

For now we found a workaround: the fields we were updating became part of the insert load instead.
MT
Premium Member
Posts: 198
Joined: Fri Mar 09, 2007 3:51 am

Re: DB2 Connector performance issue

Post by MT »

Hi neena
Coming to the table: it is partitioned by one of the date fields (this field is not in any index). There is an index on the 5-part natural key that we are updating (and we had 32 indexes on this table in total).
So you have 32 indexes on a table and you wonder about performance - hmm. I think you should talk to your DBA; that is always a good idea in such situations.
An index has benefits when you query the data, but changing data costs index maintenance - 32 times over in your case.
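The maintenance cost scales with the index count; a rough write-amplification model (assumption: the worst case where every update touches columns covered by each secondary index):

```python
# Worst-case count of extra index maintenance operations for an update job.
def index_maintenance_ops(rows_updated, num_indexes):
    return rows_updated * num_indexes

# 60 million updated rows * 32 indexes = 1.92 billion index operations
print(index_maintenance_ops(60_000_000, 32))
```

In practice only the indexes covering changed columns need maintenance, but even a fraction of that number dwarfs the 60 million row updates themselves.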
I assume that you used table partitioning (not database partitioning), and since you write that you do not have the partitioning key in the indexes, you only have so-called global indexes instead of local ones (one per partition).
This is again a good point to talk to your DB2 DBA...
It also depends on whether the partitioning key is one of the fields that gets updated - that would force a row to move to another partition, causing even more overhead.
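A toy model of that effect (assumptions: range partitioning by month on a date column; the column and row layout are illustrative, not the actual table):

```python
# If an update changes the partitioning key, the row no longer belongs
# to its current range partition and the database must relocate it.
def partition_of(date_str):
    return date_str[:7]                  # "YYYY-MM" -> one partition per month

row = {"id": 1, "load_date": "2012-03-15"}
before = partition_of(row["load_date"])
row["load_date"] = "2012-04-02"          # the update hits the partitioning key
moved = partition_of(row["load_date"]) != before   # True: row must move
print(moved)
```

An update that stays within the same partition avoids this relocation entirely, which is why updating non-partitioning columns is so much cheaper.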
Here are the things we have tried so far without success. This is in production.

1. Initially the update was running at 700 rows/sec, so we dropped a couple of indexes and tried again, but it made no difference.
So how many is a couple - two? Four? Not much against over thirty indexes.
2. We removed the foreign key relation and tried, but no improvement.
3. We removed the partitioning on the table and tried, but no improvement. (Interestingly, in development, with the table unpartitioned in this scenario we were getting 5,000 rows/sec; after partitioning it dropped to 2,500 rows/sec.)
I described a reason above why moving updated rows to another partition can be painful - this COULD be why the non-partitioned table performs better.
4. We ran an explain plan on the query and found that the index did not add any overhead to the update.
Well, the index over the five columns is absolutely needed to identify the row to be updated - the other indexes, on the other hand, cause additional maintenance work which you would not see in the explain plan.

To sum it up - talk to your DBA.
regards

Michael