hi
How can I improve performance while fetching data from Oracle using the ODBC stage? My input has 13 lakh (1.3 million) rows, and I am looking for a speed near 1000 rows/sec. Also, if I attempt to View Data I get a metadata mismatch error followed by a 'data source empty' message, in spite of the table containing data.
Improving performance while fetching data from an Oracle stage
Moderators: chulett, rschirm, roy
- Participant
- Posts: 67
- Joined: Thu Apr 06, 2006 7:13 am

- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
First off, get rid of the metadata mismatch. Import the Oracle table definition and load that (just the columns that you need) into your DataStage job. Even that will improve the speed a little.
For the rest, the problem is what you are doing with the rows once they arrive. Chances are that the problem is not extraction from Oracle. To prove this, create a job that extracts from Oracle and writes to a text file, performing no transformation. Get performance metrics for this job. This is the upper limit; any processing you add to it will reduce throughput.
Add transformation, keeping the text file for output. Performance metrics will now reveal the impact of transformation processing.
Any other slowness is necessarily the load phase. Split this out; use sqlldr (or other bulk loader if the target is not Oracle) to load from the text file into the target table. This will be faster than INSERT statements.
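As a sketch of the bulk-load step: a SQL*Loader control file for a comma-delimited extract file might look like the following (the file, table, and column names here are hypothetical, not from the original post).

```
-- load_target.ctl  (hypothetical file, table, and column names)
LOAD DATA
INFILE 'extract.txt'
APPEND
INTO TABLE target_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(cust_id, cust_name, order_date DATE "YYYY-MM-DD")
```

Invoked as `sqlldr userid=... control=load_target.ctl direct=true`, where `direct=true` requests a direct-path load that bypasses conventional INSERT processing; check with your DBA whether direct path is appropriate for the target table.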
Consider making the job multi-instance, each instance processing a similar proportion of the source data. This will not yield much gain on a single-CPU machine, but may well do so on a multi-CPU machine.
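One common way to split the source evenly across N instances (a sketch only; the table and key column are hypothetical) is to parameterise each instance's extraction SQL with a MOD predicate on a numeric key, using DataStage's `#Parameter#` job-parameter syntax:

```
-- Instance k of N (k = 0 .. N-1); ORDER_ID is a hypothetical numeric key
SELECT order_id, cust_id, amount
FROM   orders
WHERE  MOD(order_id, #NumInstances#) = #InstanceNum#
```

Each instance then extracts a disjoint, similarly sized slice of the table, at the cost of N separate scans unless the key is indexed or the table is partitioned on it.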
If it is still too slow, consult with your DBA about how to improve the load performance (temporarily removing constraints, indexes, and so on). By then you will have exhausted the possibilities within DataStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
<peeve>It's "hashed" file, please get it right! A hash file is used for filing hash.</peeve>
"Hashed" refers to the fact that keys are located using a hashing algorithm (rather than a table scan or primary key index).
"Hashed" refers to the fact that keys are located using a hashing algorithm (rather than a table scan or primary key index).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.