Slow Loading Performance - Oracle 10g

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

johm73
Charter Member
Posts: 57
Joined: Tue Jan 11, 2005 3:47 pm

Slow Loading Performance - Oracle 10g

Post by johm73 »

Source: 10gR2
Target: 10gR2

Pulling about 210K rows
54K get rejected to seq file
156K get loaded to target

Total load time is about 10 minutes; rows/sec is around 270. I've tried multiple array size settings and tuned the query on the source, but I cannot get any better performance than that. I've worked with other ETL tools and generally get around 3,000-5,000 rows/sec.

I'm curious about others' thoughts on my current performance and what my expectations should be. I realize it is very system dependent, but does anyone know of DataStage Server tuning parameters on AIX that could be looked into?

I find it hard to believe that such a simple job is taking this long to run.

Thanks.

Jeff
I_Server_Whale
Premium Member
Posts: 1255
Joined: Wed Feb 02, 2005 11:54 am
Location: United States of America

Post by I_Server_Whale »

Jeff,

Hmm... Are they just inserts, just updates, or upserts?

Whale.
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

One reason is rejects. Try to get rid of your rejects. Correction: try to get rid of rejects that create warnings in your log. Are you getting warnings? Also, if you're doing upserts, they are the worst. Try splitting your streams and doing the inserts and updates separately.
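
Something along these lines on two separate target OCI links, rather than one upsert link (the table and column names here are only placeholders):

[code]
-- Insert-only link: rows the lookup says are new
INSERT INTO tgt_schema.tgt_table (cust_id, col_a, col_b)
VALUES (:1, :2, :3);

-- Update-only link: rows that already exist in the target
UPDATE tgt_schema.tgt_table
   SET col_a = :2,
       col_b = :3
 WHERE cust_id = :1;
[/code]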
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
I_Server_Whale
Premium Member
Posts: 1255
Joined: Wed Feb 02, 2005 11:54 am
Location: United States of America

Post by I_Server_Whale »

Also, how are you handling the rejects in the transformer?
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

1. What is your source database resource load when your process is running?
2. What is your target database resource load when your process is running?
3. What is your DataStage resource load when your process is running?
4. What is your DataStage job process CPU utilization when running?
5. What is the nature of your source data? 100M rows of 1 integer column will perform differently than 1M rows of 1000 integer columns when your unit of measure is ROWS/SEC.
6. What is the nature of the target table? Is it a partitioned table? Does it have a lot of indexes? Are there triggers on the table? (A few quick dictionary checks for these are sketched just after this list.)
7. What is the DML you're using? INSERT ONLY will generate a lot of rejects on duplicated keys. INSERT else UPDATE tries the insert first, then does the update. UPDATE else INSERT tries the update first, then does the insert.
8. If most of your data is updates, that's significantly slower than doing inserts. If most of your data is inserts, what's the table growth factors?
9. What is the SQL pulling out of the source table? Is it a straight table dump or a complicated query?
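
For #6, dictionary queries along these lines will tell you (the schema and table names are placeholders):

[code]
-- Indexes, triggers, and partitioning on the target table (placeholder names)
SELECT index_name, uniqueness
  FROM all_indexes
 WHERE table_owner = 'TGT_SCHEMA' AND table_name = 'TGT_TABLE';

SELECT trigger_name, status
  FROM all_triggers
 WHERE table_owner = 'TGT_SCHEMA' AND table_name = 'TGT_TABLE';

SELECT partitioning_type
  FROM all_part_tables
 WHERE owner = 'TGT_SCHEMA' AND table_name = 'TGT_TABLE';
[/code]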


This is just off the top of my head. Your job design probably looks like OCI --> XFM --> OCI, which is the simplest job to design and almost guaranteed to be the slowest job in performance. You're a premium member, you have complete access to read the methods and design characteristics whereby we move millions of rows at significant speeds using Server.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
johm73
Charter Member
Posts: 57
Joined: Tue Jan 11, 2005 3:47 pm

Post by johm73 »

[quote="DSguru2B"]One reason is rejects. Try to get rid of your rejects. Correction: Try to get rid of rejects that create warnings in your log. Are you getting warnings? Also if your doing upserts, they are the worst. Try splitting your streams and doing inserts and updates seperately.[/quote]


They are only inserts, no updates at all. There is only one warning of type Reject in the log; it gives me the total number of rejected rows. I push the lookups down to the database within the source query rather than using hashed files (they were making my job even slower). So I know the rejected (failed-lookup) rows right away by checking for NULL in two specific mapped columns.
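
The source query is shaped roughly like this (the table and column names are placeholders, not the real ones):

[code]
-- Left outer joins push the lookups into the database;
-- a NULL in r1.code or r2.code marks a failed lookup (reject).
SELECT s.cust_id,
       s.amount,
       r1.code AS region_code,
       r2.code AS product_code
  FROM src_schema.src_table s
  LEFT OUTER JOIN ref_schema.region_ref  r1 ON r1.region_id  = s.region_id
  LEFT OUTER JOIN ref_schema.product_ref r2 ON r2.product_id = s.product_id;
[/code]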

Do you guys have general thoughts on the throughput (rows/sec) a well-tuned job should be getting? Are my initial expectations OK, or out of bounds for DataStage?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Plus, ditch the rejects. :( The error mechanism is too verbose and generates significant network traffic, and all of the job logging drops you to a crawl. If you're simply using the reject mechanism to determine new rows versus existing ones, you're better off with a hashed lookup.
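
A minimal sketch of that approach, with placeholder names: one small job dumps the target's keys into a hashed file, and the main job's transformer does its reference lookup against that file, so the stream itself decides new versus existing instead of letting Oracle reject duplicates.

[code]
-- Key extract that would populate the hashed file (placeholder names)
SELECT cust_id
  FROM tgt_schema.tgt_table;
[/code]

The insert link's constraint is then just the reference link's NOTFOUND flag, and no reject handling is needed for that purpose.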
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

You don't measure in ROWS/SEC. Performance is measured by CPU utilization. If your DS process thread can run at its maximum speed, which for a single-threaded process is a fully utilized CPU, then you have a tuned job. If the process is waiting on network traffic, as in a query sending data across the network, coupled with the time the database takes to collect and send the data, then the DS process will probably be using less than a whole CPU.

You need to concentrate on isolating processing tasks to remove as many variables as you can. Jobs should not mix source data extraction and target table loading within the same stream. You cannot optimize or parallelize in those cases. If you need to invoke multiple job instances and partition your processing stream, creating parallel transformation pipelines, you can't, because the source and target databases are intermixed in the same stream.
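
For example, a staged layout along these lines (the exact stages are illustrative, not a prescription):

Extract:   OCI --> XFM --> Seq File
Transform: Seq File --> XFM (reference lookups) --> Seq File   (multi-instance, partitioned by key)
Load:      Seq File --> OCI (or a bulk loader such as sqlldr)

Each piece can then be timed and tuned on its own.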

You've also limited yourself by not using hashed files, the primary means for excellent reference processing.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I am loading Oracle 10g with a ~1.5 KB record size at a throughput of 2.5K rows/sec, and that is on a 2-CPU server engine, all inserts. It also depends upon your record size, table indexes, etc. Out of curiosity, what transaction size/array size combo have you tried?
Last edited by DSguru2B on Fri Jan 12, 2007 4:13 pm, edited 1 time in total.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
johm73
Charter Member
Posts: 57
Joined: Tue Jan 11, 2005 3:47 pm

Post by johm73 »

[quote="kcbland"]1. What is your source database resource load when your process is running?
2. What is your target database resource load when your process is running?
3. What is your DataStage resource load when your process is running?
4. What is your DataStage job process CPU utilization when running?
5. What is the nature of your source data? 100M rows of 1 integer column will perform differently than 1M rows of 1000 integer columns when your unit of measure is ROWS/SEC.
6. What is the nature of the target table? Is it a partitioned table? Does it have a lot of indexes? Are there triggers on the table?
7. What is the DML you're using? INSERT ONLY will generate a lot of rejects on duplicated keys. INSERT else UPDATE tries the insert first, then does the update. UPDATE else INSERT tries the update first, then does the insert.
8. If most of your data is updates, that's significantly slower than doing inserts. If most of your data is inserts, what's the table growth factors?
9. What is the SQL pulling out of the source table? Is it a straight table dump or a complicated query?


This is just off the top of my head. Your job design probably looks like OCI --> XFM --> OCI, which is the simplest job to design and almost guaranteed to be the slowest job in performance. You're a premium member, you have complete access to read the methods and design characteristics whereby we move millions of rows at significant speeds using Server.[/quote]


The source database and target database are the same, just different schemas.

I'm the only user on the database and the DataStage server; nothing else is running on them.

Only doing inserts.

The query isn't too complex, but the tables are somewhat large. I've got it tuned as well as I can get it for now; as always, there's room for improvement, I'm sure.

Can you tell me how I access these methods and designs to get maximum performance? Post links or give me search text to use, please.
johm73
Charter Member
Posts: 57
Joined: Tue Jan 11, 2005 3:47 pm

Post by johm73 »

[quote="kcbland"]You don't measure in ROWS/SEC. Performance is measured by CPU utilization. If your DS process thread can run at its maximum speed, which for a single-threaded process is a fully utilized CPU, then you have a tuned job. If the process is waiting on network traffic, as in a query sending data across the network, coupled with the time the database takes to collect and send the data, then the DS process will probably be using less than a whole CPU.

You need to concentrate on isolating processing tasks to remove as many variables as you can. Jobs should not mix source data extraction and target table loading within the same stream. You cannot optimize or parallelize in those cases. If you need to invoke multiple job instances and partition your processing stream, creating parallel transformation pipelines, you can't, because the source and target databases are intermixed in the same stream.

You've also limited yourself by not using hashed files, the primary means for excellent reference processing.[/quote]

I only removed the hashed files because they were slowing my jobs down. I get better performance now that I've pushed those lookups into the database. Besides, building the hashed files in other jobs and accessing them in another job was extending the overall processing time of my entire load process. Hashed files are good, no doubt, but not the be-all and end-all.
johm73
Charter Member
Posts: 57
Joined: Tue Jan 11, 2005 3:47 pm

Post by johm73 »

[quote="DSguru2B"]I am loading Oracle 10g with ~1.5KB record size with a throughput of 2.5Krows/sec. I am talking about 2 cpu server engine.All inserts. It also depends upon your record size, table indices etc. Out of curiosity. What transaction size/array size combo you tried ?[/quote]

My row size is 104 bytes, so I used 1024/104 = 9.8 as the array size. The transaction size I set to 0.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Try an array size of 600, and try higher numbers; 10 won't bring you much. Also try a transaction size of 0, then 10K or 20K. If you absolutely have to roll back when an error occurs, then specify a transaction size of 0; otherwise experiment with other numbers.
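
As a rough illustration with your 104-byte row:

10 rows x 104 bytes is roughly 1 KB per network round trip
600 rows x 104 bytes is roughly 61 KB per network round trip

Either payload is trivial for Oracle; the larger array simply cuts the number of round trips by a factor of 60.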
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
johm73
Charter Member
Posts: 57
Joined: Tue Jan 11, 2005 3:47 pm

Post by johm73 »

[quote="DSguru2B"]Try Array size 600. Try higher numbers. 10 wont bring you much. Also try transaction size with 0 then 10k or 20k. If you absolutely have to rollback if error occurs then specify transaction size as 0 else try with other numbers.[/quote]

OK, I'll give some of that a shot. I've tried other numbers like 32767 and 100000, and they gave me the same result. The calculation I gave came from some searching I did, and I thought I'd try it.

Any thoughts on what the pre-fetch memory should be?
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

You can set that to 500 as well, but that's not the bottleneck. It's your target database that has a backlog effect on the source. If you put @FALSE in the constraint of your output link, you will see that the input link's rows/sec will soar. An array size of 10 is hardly going to make any difference. Where did you get that calculation from?
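
In other words, temporarily set the output link constraint to just @FALSE so the job reads and transforms but writes nothing; the input link rate you see then is roughly your read-side ceiling, and the gap between that and your current ~270 rows/sec is what the target load is costing you.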
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.