Decrease in Load Performance - Unsure of Cause

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
asingh
Premium Member
Premium Member
Posts: 46
Joined: Sun Jan 28, 2007 5:31 pm
Location: Sydney

Decrease in Load Performance - Unsure of Cause

Post by asingh »

A job that has constantly been running quickly (sub 10 seconds) for small volumes is now taking hours to load for a bulk load whichis about 100 times larger.

Apart from the difference in volume, the target table has had an additional column added. Statistics were run on the table prior to the load.

The stage has not had it's table definition modified for the new column however no values need to be loaded for it. Any guesses on whether this or something else could this have an effect on the performance?
nick.bond
Charter Member
Charter Member
Posts: 230
Joined: Thu Jan 15, 2004 12:00 pm
Location: London

Post by nick.bond »

Is it definately the bulk load or the job?

If it's the bulk load recreate the table with your previous creation script and run the job again. Is it quiker? If it is re-check what is different between the two tables? additional indexes? Also monitor the database or ask your DBA to do it for you whilst you run the job, it could just be that something else, like DB backups are running and taking all resources.
Regards,

Nick.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

What stage are you using to load, and what is the nature of the data? sqlldr DIRECT path versus DML is an unfair contest. If you're loading using DML, it means you've updates and inserts. Updates are notoriously intensive.

Would an additional column unstated in the stage definition affect load performance of inserts -- absolutely. It means that the database is having to supply the default value.

You CANNOT project a 10 second load into how long 100 X the data will take. You're dealing with high watermarks, rollback segments, memory, disk, etc constraints that don't factor at a couple of thousand rows.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
asingh
Premium Member
Premium Member
Posts: 46
Joined: Sun Jan 28, 2007 5:31 pm
Location: Sydney

Post by asingh »

Actually the nature of the bulk load is solely updates, I did not mean to imply there were any inserts. The word 'load' is used interchangeably at my workplace for inserts and updates.

The DBA has investigated and not found anything inherently wrong with the SQL being run. The table was re-loaded prior to this run and one opinion offered was that after a re-load the table is initially slow for inserts/updates for the first few runs.

We are using an Oracle 9 OCI stage with the SQL being generated.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

asingh wrote:...and one opinion offered was that after a re-load the table is initially slow for inserts/updates for the first few runs.
What??? :shock: The table needs to be "warmed up" before it's faster? That's a honker. How about "chained row updates?" Maybe the updates are slow because the table is compacted, now that you're squeezing updates into VARCHAR columns maybe you're chaining rows? What about index updates? Maybe nullable indexed columns are now being updated with values? A freshly loaded table with bitmapped indexes will have clean bitmaps, but updates progressively and tragically degrade them. What about partitioning? Maybe the updates are spanning partitions

There's a lot of reasons updates can be slow, but I've never heard of a table getting faster over time, it's always a degradation. A DBA who says "be patient, the table will speed up over time" is worthy of the moniker Don't Bother Asking. :x
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

asingh wrote:The word 'load' is used interchangeably at my workplace for inserts and updates.
And the word 'bulk' is being used incorrectly. A 'bulk' load implies one of the stages that leverage sqlldr and therefor have some form of 'bulk' in their name. You've just got plain ol' DML updates where I guess bulk means 'a whole bunch a'.

I'd wager you are updating on a non-indexed column and doing full table scans. Initial runs seem 'fine' because the table volume is low but as volume increases in the table, your loads get slower and slower.
-craig

"You can never have too many knives" -- Logan Nine Fingers
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
Another performance stopper, especially when your talking about updating columns, that might be causing such a poor performance, might be a bitmap index/s on that table, if they exist.

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
Post Reply