Loading a Oracle table with Partitions

jreddy · Post by **jreddy** » Thu Jan 10, 2008 7:32 am

Hello,
I need to load 23Mil rows from a view (from another DB on another box) to a table, that is partitioned on 2 columns and lies on yet another box.

Am using Oracle bulk load (Write method LOAD on oracle enterprise stage) to truncate the table and insert data from the source view.
And the performance is very bad, each million takes about a min to load even if this is the only process running on the box and there is minimum activity over the network.

I am assuming that since the source is a view, reading data is sequential, but is there a way to speed up the read and write process. There are no other transformations to data in between, its just a staging process to get data from source into staging area.

thanks

ArndW · Post by **ArndW** » Thu Jan 10, 2008 7:34 am

You first need to identify the bottleneck in your processing. If you write a job that just reads the view into a peek stage what speeds do you achieve - are they the same as when you are writing to Oracle or slower?

jreddy · Post by **jreddy** » Thu Jan 10, 2008 8:50 am

Thank You ArndW,
I ran a test as you suggested and observed that when i read from a view to a peek stage, the speeds are a little over 5k rows/sec more than what i had when i was writing to the oracle table.

Is there anyway, we can improve the speeds at which we can read data or is it always sequential read when reading from an Oracle source (assuming our source does not have any partitions)

I tried using the Partition table option (on source oracle enterprise stage), but i got the error
APT_OracleRowid8PartTable: failed to get the block count.

and i sort of expected it, as its a view.

ArndW · Post by **ArndW** » Thu Jan 10, 2008 8:57 am

I haven't used that option for PX Oracle but would think that you should be able to get it working. But it might be the case that the read speed is being limited by network throughput and not the Oracle database (unlikely in this case, but still possible). You could try doing what we used to do with server jobs - put in a 2nd oracle stage identical to the first and change your SELECT statements to the view so each gets only half the rows (i.e. MOD(CUSTNO,2)=1 or the like) and merge the two streams in the job and still write to a PEEK stage. Has the rows/second speed changed?

jreddy · Post by **jreddy** » Thu Jan 10, 2008 9:40 am

Thank you again ArndW,
I think we are in the right track, as the performance increased, i'd say i saw double the speeds that i had initially when i split up the source into 2 oracle stages. But i need to increase it further, as its now loading 23 mil in 14 min or so.

But i am not sure i understand how splitting the volume is helping increase the performance.
You mentioned about network throughput, I am the only one running jobs on the DS server or executing queries against this DB, so how is network throughput the culprit in my case.. pls let me know, can they be modified at unix/db level

thanks in advance

jreddy · Post by **jreddy** » Thu Jan 10, 2008 9:54 am

I forgot to add that i also added a hint to the source SQL, /*+ PARALLEL (tablename) */ and i think that also helped faster reads, but i would definitely like to work more on the lines of increasing network throughput to gain some more performance.

ArndW · Post by **ArndW** » Thu Jan 10, 2008 10:34 am

No, by splitting it into 2 reads you have shown that the bottleneck is not the network. You should look into the detail docs for Oracle enterprise and get parallel reads for this view working, then you can remove the 2nd stage and use the parallelism from the APT_CONFIG file to drive your throughput.