Writing to ORAOCI9 stage - Performance improvement

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Nagin
Charter Member
Posts: 89
Joined: Thu Jan 26, 2006 12:37 pm

Writing to ORAOCI9 stage - Performance improvement

Post by Nagin »

Hi,
I have a simple job that was developed by someone else.

ORAOCI9 stage --> Transformer --> ORAOCI9 stage.

I am trying to tune the performance of the job. It currently processes 650 rows per second, and my source has 10 million rows, so at 650 rps the load takes over 4 hours.

We have the same job in Informatica, which processes 6,500 rows per second: 10x faster.

Here are the things I have tried.

The source ORAOCI9 stage had its Array size set to 1. I changed it to 32767 (2^15 - 1, as per the documentation) and put @FALSE in the Transformer constraint to verify that reading from the database is not a bottleneck, as suggested in one of the posts here. The read speed used to be 650 rps; with the larger Array size and the @FALSE constraint in place, the job now reads 6,700 rps from the source.

Now, I took the @FALSE out of the constraint to allow writing to the target. The target Array size was also set to 1, so I changed it to match the source. Transaction size was set to 0; since my source has over 10 million rows, I was concerned about running out of buffer space and changed the Transaction size to 32767, as per the documentation. There is also a Transaction Handling tab in the target ORAOCI9 stage, which was set to 10,000; I changed that to 32767 as well to make sure it wouldn't become the bottleneck. Now the job is reading just 2,100 rows per second and writing the same.
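(For what it's worth, at 32,767 rows per transaction a 10-million-row load commits roughly 10,000,000 / 32,767, about 305 times, versus 1,000 commits at the previous 10,000-row setting.)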

Any thoughts on how can I improve the performance?

Thank you very much in advance.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Is this Server or PX? What's the DML like? What's the load on the DS Server (prstat, topas, top, glance, etc.)? If Server, why not use multiple job instances with a query partitioning scheme? If PX, why not use Enterprise stages?
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Nagin
Charter Member
Posts: 89
Joined: Thu Jan 26, 2006 12:37 pm

Post by Nagin »

kcbland wrote: Is this Server or PX? What's the DML like? What's the load on the DS Server (prstat, topas, top, glance, etc.)? If Server, why not use multiple job instances with a query partitioning scheme? If PX, why not use Enterprise stages?
This is a server job, and the DML truncates the table and inserts rows.

I ran the topas command on Unix; here are the stats:


CPU:      Kernel 26.2%    User 31.8%    Wait 0.0%    Idle 42.0%
          Physc = 1.02    %Entc = 64.0    Runqueue 1.0    Waitqueue 0.0
Events:   Reads 283    Writes 38    Forks 0    Execs 0
          Rawin 0    Ttyout 383    Igets 0    Namei 139    Dirblk 0
Network:  en0  KBPS 6.1   I-Pack 11.4   O-Pack 10.4   KB-In 1.7   KB-Out 4.4
          lo0  KBPS 0.0   I-Pack  0.0   O-Pack  0.0   KB-In 0.0   KB-Out 0.0
Paging:   Faults 0    Steals 0
Memory:   Real 8320 MB    %Comp 46.9


I am going to try using multiple instances with query partitioning and see how it works out.

Thanks Kenneth.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Write one multi-instance job with two parameters, PartitionCount and PartitionNumber, like this:

OCI-->XFM-->SEQ (name like yourfile_#PartitionNumber#.dat)

Put a WHERE clause in to partition the data:

Code:

MOD(pick an integer column like the primary key, #PartitionCount#) = #PartitionNumber# - 1
Run PartitionCount instances, setting PartitionNumber from 1 to PartitionCount on each respective instance.
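
For example, a minimal sketch assuming a hypothetical integer key ORDER_ID on a hypothetical SOURCE_TABLE, with PartitionCount = 4; instance 3 would select:

Code:

SELECT *
FROM source_table
WHERE MOD(order_id, 4) = 2   -- PartitionNumber 3 - 1 = 2

Each instance would then be started with something like this dsjob command line (project and job names hypothetical):

Code:

dsjob -run -param PartitionCount=4 -param PartitionNumber=3 yourproject yourjob.3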

You'll get PartitionCount files. Concatenate them together with a Unix cat command:

Code:

cat yourfile_*.dat > yourfile.dat
Now, use sqlldr to DIRECT path truncate and load your target table. Put it all in a Sequence: run the extraction jobs (Job stages) simultaneously, then a Command stage to cat the files, then a Command stage to run sqlldr.
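
A minimal SQL*Loader sketch, with hypothetical file, table, and column names (the control file, say yourfile.ctl):

Code:

OPTIONS (DIRECT=TRUE)
LOAD DATA
INFILE 'yourfile.dat'
TRUNCATE
INTO TABLE target_table
FIELDS TERMINATED BY '|'
(order_id, customer_name, amount)

run with something like:

Code:

sqlldr userid=scott/tiger@yourdb control=yourfile.ctl log=yourfile.log

The TRUNCATE keyword empties the table before the direct-path load, which matches the truncate-and-insert DML described earlier in the thread.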
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Another thought: something not quite as elegant or performant as Ken's approach, but simpler to get to.

Don't use the OCI stage for millions of rows, especially if all you are doing is pure inserts. Swap out your target stage for the Oracle Bulk Loader stage set to 'Automatic' mode. You'll need to arrange for the actual truncation to happen in a separate step, however, as the stage itself doesn't support that. It could be another job, or something done 'before job'; several ways to skin that cat.
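
For instance, a minimal sketch of handling the truncate in a 'before job' or Command stage step via sqlplus (credentials and table name hypothetical):

Code:

sqlplus -s scott/tiger@yourdb <<EOF
TRUNCATE TABLE target_table;
EXIT;
EOF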

Might be worth a shot, as an academic exercise if nothing else. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers