Page 1 of 1

How to load huge volumes of data in an Oracle DB? Bulk Load?

Posted: Fri Nov 23, 2007 3:46 pm
by Minhajuddin
Hi all,

What is the best way to do inserts in huge volumes?
a) Use the oracle bulk loader in a server job. or
b) Use of a parallel job on multiple nodes.

Which one of the above is fastest?

Thanks for all your valuable replies.

Posted: Fri Nov 23, 2007 8:24 pm
by ray.wurlod
Oracle Enterprise stage using direct write (effectively parallel bulk load).

Posted: Sat Nov 24, 2007 6:35 am
by Minhajuddin
Thanks for the reply Ray.

By direct write, are you referring to the Load method?

Re: How to load huge volumes of data in an Oracle DB? Bulk L

Posted: Sat Nov 24, 2007 8:45 am
by tkbharani
The best and fastest way to load is oracle sqlldr function(direct=true). In this you have many options like readsize,buffersize..through which you can load atleast a minimum of 3 crores in 40 minutes with Index. This also depends on rollback segements, unrecoverable options in ORACLE. This is my concern with ORACLE.

Re-building Index for large table will take time..

Posted: Mon Nov 26, 2007 12:45 am
by tkbharani
1. We are always loading the data into table for using/extraction,else there is no purpose in loading. if Index is there then while extraction from the table the query will execute faster.Without index query's are always slow in large table. Re-building Index for such a large tables will take more time than loading data.
2. The parameters can be obtained from oracle documents/web pages. Some of the parameters are as follows.
The following is a list of all the parameters that may be used in the command line:
USERID - Oracle username/password
CONTROL - Control file name
LOG - Log file name
BAD - Bad file name
DATA - Data file name
DISCARD - Discard file name
DISCARDMAX - Number of discards to allow (Default all)
SKIP - Number of logical records to skip (Default 0)
LOAD - Number of logical records to load (Default all)
ERRORS - Number of errors to allow (Default 50)
ROWS - Number of rows in conventional path bind array or between direct path data saves
(Default: Conventional Path 64, Direct path all)
BINDSIZE - Size of conventional path bind array in bytes (System-dependent default)
SILENT - Suppress messages during run (header, feedback, errors, discards, partitions, all)
DIRECT - Use direct path (Default FALSE)
PARFILE - Parameter file: name of file that contains parameter specifications
PARALLEL - Perform parallel load (Default FALSE)
FILE - File to allocate extents from

Posted: Mon Nov 26, 2007 1:24 am
by rleishman
If you don't want to rebuild the indexes, then you can still do DIRECT=TRUE PARALLEL=FALSE providing you have no referential integrity constraints.

If you do have RI constraints that you do not want to disable, then the fastest load method is also the slowest: conventional inserts. You can use an UPSERT method or a bulk load method with DIRECT=FALSE PARALLEL=FALSE - I have found the upsert marginally faster because it does not have re-convert all of the datatypes.

For the REALLY brave, the fastest way to load data into an indexed, RI enabled, PARTITIONED table is via Partition Exchange. With this method, you load your incoming data into an empty table using DIRECT=TRUE PARALLEL=TRUE, augment it with pre-existing rows from the target partition that were not present in the current load using DIRECT=TRUE PARALLEL=TRUE, enable indexes and RI, then exchange it with the target partition.

DataStage does not natively support Partition Exchange. It's tricky, and raises a heap of implementation-specific difficulties that are hard to solve.

You can specify alternate PARALLEL/DIRECT values in the $APT_ORACLE_LOAD_OPTIONS Environment Variable.

Posted: Wed Nov 28, 2007 1:30 pm
by Minhajuddin
Thanks for all you detailed answers.

I still have a problem figuring out how I can load data through a direct load.

a) Should I create the .par the .ctl and the data file and run some operating system commands to load the data through direct load.

or

b) Can we set all these parameters in the Oracle enterprise stage using Method = Load.

Posted: Wed Nov 28, 2007 2:53 pm
by ray.wurlod
Experiment. Change the write method to Load and investigate the available properties to add under each of the properties folders. You can thus answer your own question. You could even refer to the Chapter in the Parallel Job Developer's Guide that discusses the Oracle Enterprise stage.