Oracle Enterprise - Reading in parallel

rickrambo · Post by **rickrambo** » Mon Jun 30, 2008 4:19 pm

How do I READ ( Not write) in parallel using Oracle Enterprise stage.?

The "Execution Mode" tab is defaulted to Sequential and is greyed. How would I change this to Parallel?

ray.wurlod · Post by **ray.wurlod** » Mon Jun 30, 2008 4:52 pm

You don't (read in parallel) unless the Oracle table is a partitioned table.

rickrambo · Post by **rickrambo** » Tue Jul 01, 2008 7:13 am

Oracle table is partitinoned.

Now, how do I specify this table to be read in parallel? Are you suggesting that partitioned table will be read in parallel even though the "Executeion Mode" is sequential? If that is the case.....

Can I not read a partitioned table sequentially?

ray.wurlod · Post by **ray.wurlod** » Tue Jul 01, 2008 4:15 pm

You can read a partitioned table sequentially. What you can not do is read a non-partitioned table in parallel.

Look for the Partition Table property.

keshav0307 · Post by **keshav0307** » Tue Jul 01, 2008 9:30 pm

Ray,
do you mean that although the execution mode of the oracle enterprise stage is sequential, if the partiotion table property is specified then the job will read the table in parallel?

even after defining the partition table, The "Execution Mode" tab is defaulted to Sequential and is greyed

jasper · Post by **jasper** » Wed Jul 02, 2008 12:16 am

Ray,
I think your statement is incorrect.
When you have an Oracle stage in a parallel job there is an option for 'partition table'. By using this you can read in parallel.

How does it work: you give a tablename to this parameter. DS will look for the min and max rowid for this table and will add this in the where clause. So if the table would have min rowid 0 and max rowid 40 and you run with 4 nodes it will run 4 times the query with adding
'where rowid between 0 and 10', between 10 and 20',....

This has nothin to do with the table fysically being partitioned or not.

side-effects:
-if you do anything like sort or aggregate in your job you will need to repartition in the following stages
-it's based on the full table, if you have an extra where clause you could get very unbalanced results
-watch your explain plan, because of the rowid it will become a full tablescan on that table.

kirankota79 · Post by **kirankota79** » Wed Jul 02, 2008 9:51 am

for four nodes ...does it reduce the time read by 4 times if we use the partition table property? i want to write to a dataset and want to use the same dataset to reload back to the same table

hello105 · Post by **hello105** » Fri Jul 04, 2008 12:45 am

You are right.But if the sql has a where clause,when you use the partiton,the result will have repeated records.Maybe it is a bug.

jasper wrote:Ray,
I think your statement is incorrect.
When you have an Oracle stage in a parallel job there is an option for 'partition table'. By using this you can read in parallel.

How does it work: you give a tablename to this parameter. DS will look for the min and max rowid for this table and will add this in the where clause. So if the table would have min rowid 0 and max rowid 40 and you run with 4 nodes it will run 4 times the query with adding
'where rowid between 0 and 10', between 10 and 20',....

This has nothin to do with the table fysically being partitioned or not.

side-effects:
-if you do anything like sort or aggregate in your job you will need to repartition in the following stages
-it's based on the full table, if you have an extra where clause you could get very unbalanced results
-watch your explain plan, because of the rowid it will become a full tablescan on that table.

jreddy · Post by **jreddy** » Tue Jul 15, 2008 11:06 am

Yes hello105, you are right.
When i set the partition table option while using a user-defined SQL query that joins multiple tables, I end up with duplicates.
The data read becomes faster, but to avoid the duplicates if we add a remove dups stage, it slows down again negating the speed that i achieve with Partition Table option..

Is this really a bug or is there something that has to be set to avoid generating duplicates.

UPS · Post by **UPS** » Tue Jul 15, 2008 2:20 pm

I do not know if this will help but I use the modulus statement in the 'WHERE' clause to simulate a parallel read. If I want 4 degrees of parallelism I create 4 Oracle stages and use the mod function in the 'WHERE' clause. However, you should use it with a numerical key.
A hack I know but hey, it reads a non-partitioned table in parallel.
Example of 4 degrees of parallelism :
ORACLE_STAGE_1
WHERE mod( key_field, 4) = 0
ORACLE_STAGE_2
WHERE mod( key_field, 4) = 1
ORACLE_STAGE_3
WHERE mod( key_field, 4) = 2
ORACLE_STAGE_4
WHERE mod( key_field, 4) = 3

DSXchange

Oracle Enterprise - Reading in parallel

Oracle Enterprise - Reading in parallel

Reading an ORACLE non-partitioned table in parallel