Oracle OCI Load stage for Ora 9?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

chrisk
Participant
Posts: 37
Joined: Wed Feb 26, 2003 6:20 pm

Oracle OCI Load stage for Ora 9?

Post by chrisk »

Hi,

We're using DS7.0 on HPUX. We installed all plug-ins at install time.

The properties of the "Oracle OCI Load" stage type when viewed in DS Manager are:

Short description: Oracle 8.1 Bulk Loader
Long description: Bulk loading Oracle databases using Oracle OCI 8.1

I've read the oraocibl.pdf (DataStage Oracle OCI Load) and it states that the ORAOCI Bulk Load stage loads data into Ora 8.1.

My question is, will this stage work with an Ora 9.2.0.3.0 database co-located on the DS7.0 HPUX server? Our Ora 9.2 database is down at the moment so I can't try. (I've sent this same question to Ascential support but thought I'd see if anyone in the field has tried).

Or, is there a ORAOCI Bulk Load stage designed for Ora 9 that didn't ship with DS7.0?

Thanks,
Chris.
Chris Kearns
Altis Consulting Pty Ltd
www.altis.com.au
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

All the bulk loader stage does is data type verify the column data against the metata (rather slowly I might add) and write you a control file. It writes an 8i control file, so doesn't take advantage of 9i or 10g features.

You have no version compatibilities to worry about, as you're moving up in versions. You will still have to execute a sqlldr session via a shell script anyway to load the data.

You're probably better served simply writing your own control file, as you have more flexibility and control. Using DataStage simply gives you a dynamic element so that the control file metadata is always derived off the current job design. But, you're still faced with that shell script, which is where your 9i features come into play.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chrisk
Participant
Posts: 37
Joined: Wed Feb 26, 2003 6:20 pm

Post by chrisk »

Thanks for your reply Ken,

What you say is how I've understood previous versions of the bulk load stage to work. In fact I have only ever previously used the ORA bulk load stages to prove that it's faster to create the .DAT file manually via a sequential file stage!

The version of the "ORAOCI Bulk Load" stage that ships with DS7.0 now has a "Load Mode" property which can be set to "Automatic" which means the SQL Loader utility does not need to be invoked manually. I.e. the stage now bulk loads the data directly with no intervention.

Hence my questions.

Ta,
C.
Chris Kearns
Altis Consulting Pty Ltd
www.altis.com.au
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

chrisk wrote:The version of the "ORAOCI Bulk Load" stage that ships with DS7.0 now has a "Load Mode" property which can be set to "Automatic" which means the SQL Loader utility does not need to be invoked manually. I.e. the stage now bulk loads the data directly with no intervention.
Shows my ignorance of version 7. :oops:


But would it change your mind? If you know the pathetic performance the stage has doing (IMO) unecessary retyping of the data, and potentially introducing issues, version compatibilities, and other stupid stuff, why use this capability with all that baggage? I don't know if it will work with 9i, but how hard can this be to run sqlldr? Is it worth it?

Sorry to clog the pipe with my ignorance, I'm just not a fan of DataStage's integration with sqlldr. sqlldr is squirrely enough with direct path loads, it seems to not notice when it has problems.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chrisk
Participant
Posts: 37
Joined: Wed Feb 26, 2003 6:20 pm

Post by chrisk »

I hear what you're saying. However, the PDF doco makes this version sound a little more s*xy so I am curious. If I had access to a working database I'd be able to give some results rather than bother everyone with these questions.

Ta,
C.
Chris Kearns
Altis Consulting Pty Ltd
www.altis.com.au
chrisk
Participant
Posts: 37
Joined: Wed Feb 26, 2003 6:20 pm

Post by chrisk »

Hmmm... I tried the DS7.0 ORAOCI Bulk Load stage using "Load Mode = Manual" so that it would just generate the CTL and DAT file. It processed a 260,000 row file (500 bytes wide) at 6,000 rows/sec.

Processing the same file to a sequential file stage ran at 29,000 rows/sec.

At face value it looks like the same overheads exist in the DS7.0 version of the ORAOCI Bulk Load stage.

C.
Chris Kearns
Altis Consulting Pty Ltd
www.altis.com.au
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Yes, but if you are transforming that data to a sequential file, and you run multiple job instances to distinct sequential files, say 5 copies, then you're processing at 150,000 rows/sec effectively. Then cat the files together and bulk load. Putting that bulk load stage into the job design eliminates instantiation capability.

That's 100% true if you're doing DIRECT path loading, as exclusive access to the table is required, unless doing named partition loading.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chrisk
Participant
Posts: 37
Joined: Wed Feb 26, 2003 6:20 pm

Post by chrisk »

I was actually agreeing with you ;-) I.e. creating the DAT file with the bulk load stage is still slow in DS7.0. Therefore, using the sequential file stage is still better and doing groovy things like multiple instances of the job will only make it better still!

Ta,
C.
Chris Kearns
Altis Consulting Pty Ltd
www.altis.com.au
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

:idea: ... and you can use the bulk loader in PARALLEL mode for even faster loading!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
wdudek
Participant
Posts: 66
Joined: Mon Dec 08, 2003 10:44 am

Post by wdudek »

We are using bulk loader files created with a sequential file in our environment, but found that our network connection greatly slowed the bulk loader. To get around this we copy the file to the oracle server and created a stored procedure that invokes the sqlldr on the oracle box. Using this we can load 10 million records in 5 minutes. thought I'd throw that one out thee for everyone's benefit.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

wdudek wrote:We are using bulk loader files created with a sequential file in our environment, but found that our network connection greatly slowed the bulk loader. To get around this we copy the file to the oracle server and created a stored procedure that invokes the sqlldr on the oracle box. Using this we can load 10 million records in 5 minutes. thought I'd throw that one out thee for everyone's benefit.
Anytime you load via CONVENTIONAL you incur significant degradation due to all of the overhead regarding indexes, constraints, etc. If you are wanting to use DIRECT path, you must co-locate the load files with the instance. DIRECT path has the highest performant loading mechanisms because it bypasses indexes, constraints, etc. It has incremental update capability for indexes, so it's the fastest means for loading, as long as you move the load files onto that server.

If you want to make loading even faster, make sure that you divide the load file into separate files and load via DIRECT path into the named partition. Make sure you have locally managed indexes and no global indexes, and you will find that you can blast zillions of rows using this paradigm.

Word of advice, ETL developers must cross-over into the DBA (architecture, not administration) realm and understand table partitioning, as well as modeling, in order to design solutions for high volumes with fastest load times. Visit Tom Kyte to see what I'm talking about! http://www.asktom.oracle.com


Good luck!
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
nkln@you
Premium Member
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

Can You Please explain in detail how its happeing? Actually i have never used Manual option in Ora Bulk Load. But i have understood this much that when i give it as manual, it creates a dat(data) file and a ctl(control) file. This dat file is nothing but the sequential file(correct me if i am wrong). Then you transferred this dat file to Oracle server. Can you Please explain this part. And after transferring, you Called SqlLdr which uses the created control file to move the data from this file to the table. Am i right?
Aim high
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

nkln@you wrote:This dat file is nothing but the sequential file(correct me if i am wrong).
It is *a* sequential file - fully transformed and load ready.
nkln@you also wrote:Then you transferred this dat file to Oracle server. Can you Please explain this part. And after transferring, you Called SqlLdr which uses the created control file to move the data from this file to the table. Am i right?
Short answer - yes. And you transfer both files if you are going to transfer anything at all.

I'm going to go out on a limb here and disagree ( :shock: ) with 2004 Ken on this one point - the assertation that you must co-locate the load files with the instance to do a direct path load. It may be a good idea - but it certainly isn't required.
-craig

"You can never have too many knives" -- Logan Nine Fingers
nkln@you
Premium Member
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

Short answer - yes. And you transfer both files if you are going to transfer anything at all.

How do you transfer files? :)
I'm going to go out on a limb here and disagree ( :shock: ) with 2004 Ken on this one point - the assertation that you must co-locate the load files with the instance to do a direct path load. It may be a good idea - but it certainly isn't required.
i guess i was clear on this concept but you have me totally confused. Can you Explain the above words in a simpler language Please? :)

And one last thing... which path i shd use to load the dat(seq. files).. Oracle 7 Load, Oracle Oci Load, or seq. file stage? i have to load some 10 million records.
Aim high
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

How do you transfer files? Seriously? How would you do it if you didn't have DataStage? There's your answer.

:? Sorry, I didn't realize my posts were so confusing. Ok... let's see...

Ken stated that files to be DIRECT path bulk loaded must be on the same server that the database lives on. I disagreed. Better? :wink:

And lastly, you should use sqlldr. No stage.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply