How to load huge volumes of data in an Oracle DB? Bulk Load?

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

How to load huge volumes of data in an Oracle DB? Bulk Load?

Post by Minhajuddin »

Hi all,

What is the best way to do inserts in huge volumes?
a) Use the oracle bulk loader in a server job. or
b) Use of a parallel job on multiple nodes.

Which one of the above is fastest?

Thanks for all your valuable replies.
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Oracle Enterprise stage using direct write (effectively parallel bulk load).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

Thanks for the reply Ray.

By direct write, are you referring to the Load method?
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
tkbharani
Premium Member
Premium Member
Posts: 71
Joined: Wed Dec 27, 2006 8:12 am
Location: Sydney

Re: How to load huge volumes of data in an Oracle DB? Bulk L

Post by tkbharani »

The best and fastest way to load is oracle sqlldr function(direct=true). In this you have many options like readsize,buffersize..through which you can load atleast a minimum of 3 crores in 40 minutes with Index. This also depends on rollback segements, unrecoverable options in ORACLE. This is my concern with ORACLE.
Thanks, BK
tkbharani
Premium Member
Premium Member
Posts: 71
Joined: Wed Dec 27, 2006 8:12 am
Location: Sydney

Re-building Index for large table will take time..

Post by tkbharani »

1. We are always loading the data into table for using/extraction,else there is no purpose in loading. if Index is there then while extraction from the table the query will execute faster.Without index query's are always slow in large table. Re-building Index for such a large tables will take more time than loading data.
2. The parameters can be obtained from oracle documents/web pages. Some of the parameters are as follows.
The following is a list of all the parameters that may be used in the command line:
USERID - Oracle username/password
CONTROL - Control file name
LOG - Log file name
BAD - Bad file name
DATA - Data file name
DISCARD - Discard file name
DISCARDMAX - Number of discards to allow (Default all)
SKIP - Number of logical records to skip (Default 0)
LOAD - Number of logical records to load (Default all)
ERRORS - Number of errors to allow (Default 50)
ROWS - Number of rows in conventional path bind array or between direct path data saves
(Default: Conventional Path 64, Direct path all)
BINDSIZE - Size of conventional path bind array in bytes (System-dependent default)
SILENT - Suppress messages during run (header, feedback, errors, discards, partitions, all)
DIRECT - Use direct path (Default FALSE)
PARFILE - Parameter file: name of file that contains parameter specifications
PARALLEL - Perform parallel load (Default FALSE)
FILE - File to allocate extents from
Thanks, BK
rleishman
Premium Member
Premium Member
Posts: 252
Joined: Mon Sep 19, 2005 10:28 pm
Location: Melbourne, Australia
Contact:

Post by rleishman »

If you don't want to rebuild the indexes, then you can still do DIRECT=TRUE PARALLEL=FALSE providing you have no referential integrity constraints.

If you do have RI constraints that you do not want to disable, then the fastest load method is also the slowest: conventional inserts. You can use an UPSERT method or a bulk load method with DIRECT=FALSE PARALLEL=FALSE - I have found the upsert marginally faster because it does not have re-convert all of the datatypes.

For the REALLY brave, the fastest way to load data into an indexed, RI enabled, PARTITIONED table is via Partition Exchange. With this method, you load your incoming data into an empty table using DIRECT=TRUE PARALLEL=TRUE, augment it with pre-existing rows from the target partition that were not present in the current load using DIRECT=TRUE PARALLEL=TRUE, enable indexes and RI, then exchange it with the target partition.

DataStage does not natively support Partition Exchange. It's tricky, and raises a heap of implementation-specific difficulties that are hard to solve.

You can specify alternate PARALLEL/DIRECT values in the $APT_ORACLE_LOAD_OPTIONS Environment Variable.
Ross Leishman
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

Thanks for all you detailed answers.

I still have a problem figuring out how I can load data through a direct load.

a) Should I create the .par the .ctl and the data file and run some operating system commands to load the data through direct load.

or

b) Can we set all these parameters in the Oracle enterprise stage using Method = Load.
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Experiment. Change the write method to Load and investigate the available properties to add under each of the properties folders. You can thus answer your own question. You could even refer to the Chapter in the Parallel Job Developer's Guide that discusses the Oracle Enterprise stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply