
Parallel Extender usage

Posted: Thu Dec 09, 2004 9:14 am
by sathyanveshi
Hi,

I need to load an Oracle table of 20 million rows into another Oracle table. My server is a 4-CPU machine. How would using Parallel Extender (the Enterprise Edition) be advantageous? I need to know how the rows are processed so that I can make the case for using the Enterprise Edition.

Cheers,
Mohan

Posted: Thu Dec 09, 2004 11:48 am
by nkumar_home
PX is very fast, but how fast depends on how many processors your PX configuration file is set to use. It tends to take over the majority of CPU on those processors, so if you are running PX with just 4 processors you should design your batch so that very few other jobs run alongside the PX jobs. Have you tried partitioned/parallel inserts directly in SQL? You can put the SQL in the "before SQL" portion of the DataStage job, as in the sketch below.
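For illustration, a minimal sketch of what such a parallel direct-path insert could look like; the table names are placeholders, and a degree of 4 assumes your 4-CPU box.

    -- Parallel DML must be enabled for the session, otherwise the
    -- PARALLEL hint on the INSERT is silently ignored.
    ALTER SESSION ENABLE PARALLEL DML;

    -- Direct-path (APPEND) parallel insert, degree 4 (placeholder names).
    INSERT /*+ APPEND PARALLEL(tgt, 4) */ INTO target_table tgt
    SELECT /*+ PARALLEL(src, 4) */ *
    FROM   source_table src;

    COMMIT;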

We have found PX very useful for insert/update jobs involving far more than 20 million rows. We have about 20 PX jobs out of 300 DataStage jobs in total, but then we have the luxury of an HP Superdome with 24 processors in production.

Re: Parallel Extender usage

Posted: Thu Dec 09, 2004 1:33 pm
by sathyanveshi
Hi,

Thanks for the response. I need to know the approximate time taken to load, say, 10 GB of data using Parallel Extender with 8 CPUs. I also need to know the technical details of how Parallel Extender executes a job: what does it do when it has to run a job that loads 20 million rows of data using 8 CPUs? Can you please answer all of the above questions?

Cheers,
Mohan

Posted: Thu Dec 09, 2004 3:21 pm
by ray.wurlod
You cannot estimate time without knowing how large the rows are and whether there are other factors influencing throughput, such as network bandwidth, the number of Oracle listeners, parallelism within the query, other locking operations on the target table, and so on.

This is a "how long is a piece of string?" question.

Once you've run it, you'll be able to answer the question, but only for the exact conditions under which that run was performed.

Some technical details can be found in the Parallel Job Developer's Guide. The execution model of the engine is covered only in the Orchestrate documentation, which is no longer available.
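One input you can at least measure yourself is row size, straight from the Oracle dictionary. A rough sketch, assuming optimizer statistics are current (gathered with DBMS_STATS, for example) and with SOURCE_TABLE as a placeholder name:

    -- Approximate data volume from dictionary statistics.
    SELECT num_rows,
           avg_row_len,
           ROUND(num_rows * avg_row_len / 1024 / 1024) AS approx_mb
    FROM   user_tables
    WHERE  table_name = 'SOURCE_TABLE';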

Posted: Mon Dec 13, 2004 5:28 pm
by T42
It also depends on what you want to do with the data between the read and load.

It also depends on how much memory you have.

It also depends on how fast your disk I/O is.

It also depends on how fast your network I/O is.

PX will use whatever is provided to it. But any hard numbers from us would only set your expectations unduly high, or unduly low, depending on what you have and are willing to invest in.

Posted: Tue Dec 14, 2004 3:11 am
by sathyanveshi
Hi,

We are doing a table-to-table load... a direct load from a source table to a target table, with no transformations in between. The target table has no database constraints such as a primary key or NOT NULL. The only catch is that we are going with a table-to-table load instead of table to sequential file to table. Our production environment is a Sun Solaris 15K server with 8 CPUs, 16 GB of memory, 548 SAN, and 1 Gbit fibre optic.

Given the above, can you tell me the average time taken to load 5 GB of data?

Have a nice day...

Cheers,
Mohan

Posted: Tue Dec 14, 2004 2:10 pm
by ray.wurlod
No.

There are too many variables, such as row size, that you haven't revealed.

You won't do it in seconds, but it won't take days either.

Posted: Wed Dec 15, 2004 3:33 am
by sathyanveshi
Hi Ray and T42,

Let me know the variables/parameters that you need and I shall provide them. I need some approximate estimates.

Cheers,
Mohan

Posted: Sun Dec 19, 2004 12:29 pm
by T42
Also, where are the databases?

To be honest, if both tables are in the same schema on the same server, and that server is separate from the PX server, it would definitely be advantageous to handle the table-to-table load with a simple SQL statement, along the lines of the sketch below.
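A sketch of the pure-SQL route, with placeholder table names. If the target table already exists:

    INSERT /*+ APPEND */ INTO target_table
    SELECT * FROM source_table;

    COMMIT;

Or, if the target can be (re)created, a NOLOGGING parallel CTAS (degree 8 assumes your 8-CPU box) avoids most redo generation:

    CREATE TABLE target_table NOLOGGING PARALLEL 8
    AS SELECT * FROM source_table;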

Just throw together a job and run the test yourself. Calculate the number of bytes being moved, investigate the type of network and disk you have and their I/O rates, and observe the actual rates with a monitoring tool provided by your operating system (or a third party). A crude first measurement is sketched below.
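For example, from SQL*Plus (placeholder names, and a 1-million-row sample assumed): time a fixed-size sample and extrapolate.

    SET TIMING ON

    -- Direct-path load of a sample; elapsed time is reported on completion.
    INSERT /*+ APPEND */ INTO target_table
    SELECT * FROM source_table WHERE ROWNUM <= 1000000;

    COMMIT;

If 1 million rows take N seconds, 20 million will take very roughly 20 x N, ignoring cache warm-up and any index maintenance on the target.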

Not only will you learn a lot about your system, you will also find out whether PX is optimized for your system. We can't do that for you without being contracted.

Sorry, I'm not available.

Posted: Sun Dec 19, 2004 2:41 pm
by ray.wurlod
I'm fully booked through to March.

What you're seeking is not something you would ordinarily expect to get for free; as our answers thus far have made clear, it is a very difficult task requiring expert analysis. Would some expert volunteer to tune your database configuration?