Writing to ORAOCI9 stage - Performance improvement

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Nagin
Charter Member
Posts: 89
Joined: Thu Jan 26, 2006 12:37 pm

Writing to ORAOCI9 stage - Performance improvement

Post by Nagin »

Hi,
I have a simple job that was developed by someone else.

ORAOCI9 stage --> Transformer --> ORAOCI9 stage.

I am trying to tune the performance of the job. It currently processes 650 rows per second, and my source has 10 million rows, so at 650 rps the load takes over 4 hours.

We have the same job in Informatica, which processes 6,500 rows per second: 10x faster.

Here are the things I have tried.

The source ORAOCI9 stage had its Array size set to 1. I changed it to 32767 (2^15 - 1, as per the documentation) and put @FALSE in the Transformer constraint to verify that reading from the database is not a bottleneck, as suggested in one of the posts here. The read speed used to be 650 rps; with the larger Array size and the @FALSE constraint in place, the job now reads 6,700 rps from the source.

Now, I took the @FALSE out of the constraint to allow writing to the target. The target Array size was also set to 1, so I changed it to match the source. Transaction size was set to 0; since my source has over 10 million rows, I was concerned about running out of buffer space and changed the Transaction size to 32767, as per the documentation. There is also a Transaction Handling tab in the target ORAOCI9 stage, which was set to 10,000; I changed that to 32767 as well to make sure it wouldn't become the bottleneck. Now the job is reading just 2,100 rows per second and writing the same.
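(For what it's worth, at 32,767 rows per transaction a 10-million-row load commits roughly 10,000,000 / 32,767, about 305 times, versus 1,000 commits at the previous 10,000-row setting.)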

Any thoughts on how can I improve the performance?

Thank you very much in advance.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Is this Server or PX? What's the DML like? What's the load on the DS Server (prstat, topas, top, glance, etc.)? If Server, why not use multiple job instances with a query partitioning scheme? If PX, why not use Enterprise stages?
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Nagin
Charter Member
Posts: 89
Joined: Thu Jan 26, 2006 12:37 pm

Post by Nagin »

kcbland wrote: Is this Server or PX? What's the DML like? What's the load on the DS Server (prstat, topas, top, glance, etc.)? If Server, why not use multiple job instances with a query partitioning scheme? If PX, why not use Enterprise stages?
This is a server job, and the DML truncates the table and inserts rows.

I ran the topas command on Unix; here are the stats:


CPU:      Kernel 26.2%    User 31.8%    Wait 0.0%    Idle 42.0%
          Physc = 1.02    %Entc = 64.0    Runqueue 1.0    Waitqueue 0.0
Events:   Reads 283    Writes 38    Forks 0    Execs 0
          Rawin 0    Ttyout 383    Igets 0    Namei 139    Dirblk 0
Network:  en0  KBPS 6.1   I-Pack 11.4   O-Pack 10.4   KB-In 1.7   KB-Out 4.4
          lo0  KBPS 0.0   I-Pack  0.0   O-Pack  0.0   KB-In 0.0   KB-Out 0.0
Paging:   Faults 0    Steals 0
Memory:   Real 8320 MB    %Comp 46.9


I am going to try using multiple instances with query partitioning and see how it works out.

Thanks Kenneth.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Write one multi-instance job with two parameters, PartitionCount and PartitionNumber, like this:

OCI-->XFM-->SEQ (name like yourfile_#PartitionNumber#.dat)

Put a WHERE clause in to partition the data:

Code:

MOD(pick an integer column like the primary key, #PartitionCount#) = #PartitionNumber# - 1
Run PartitionCount instances, setting PartitionNumber from 1 to PartitionCount on each respective instance.
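
For example, a minimal sketch assuming a hypothetical integer key ORDER_ID on a hypothetical SOURCE_TABLE, with PartitionCount = 4; instance 3 would select:

Code:

SELECT *
FROM source_table
WHERE MOD(order_id, 4) = 2   -- PartitionNumber 3 - 1 = 2

Each instance would then be started with something like this dsjob command line (project and job names hypothetical):

Code:

dsjob -run -param PartitionCount=4 -param PartitionNumber=3 yourproject yourjob.3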

You'll get PartitionCount files. Concatenate them together with a Unix cat command:

Code:

cat yourfile_*.dat > yourfile.dat
Now, use sqlldr to DIRECT path truncate and load your target table. Put it all in a Sequence: run the extraction jobs (Job stages) simultaneously, then a Command stage to cat the files, then a Command stage to run sqlldr.
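
A minimal SQL*Loader sketch, with hypothetical file, table, and column names (the control file, say yourfile.ctl):

Code:

OPTIONS (DIRECT=TRUE)
LOAD DATA
INFILE 'yourfile.dat'
TRUNCATE
INTO TABLE target_table
FIELDS TERMINATED BY '|'
(order_id, customer_name, amount)

run with something like:

Code:

sqlldr userid=scott/tiger@yourdb control=yourfile.ctl log=yourfile.log

The TRUNCATE keyword empties the table before the direct-path load, which matches the truncate-and-insert DML described earlier in the thread.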
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Another thought: something not quite as elegant or performant as Ken's approach, but simpler to get to.

Don't use the OCI stage for millions of rows, especially if all you are doing is pure inserts. Swap out your target stage for the Oracle Bulk Loader stage set to 'Automatic' mode. You'll need to arrange for the actual truncation to happen in a separate step, however, as the stage itself doesn't support that. It could be another job, or something done 'before job'; several ways to skin that cat.
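
For instance, a minimal sketch of handling the truncate in a 'before job' or Command stage step via sqlplus (credentials and table name hypothetical):

Code:

sqlplus -s scott/tiger@yourdb <<EOF
TRUNCATE TABLE target_table;
EXIT;
EOF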

Might be worth a shot, as an academic exercise if nothing else. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers