Oracle Enterprise - Reading in parallel
Moderators: chulett, rschirm, roy
Oracle Enterprise - Reading in parallel
How do I READ ( Not write) in parallel using Oracle Enterprise stage.?
The "Execution Mode" tab is defaulted to Sequential and is greyed. How would I change this to Parallel?
The "Execution Mode" tab is defaulted to Sequential and is greyed. How would I change this to Parallel?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 783
- Joined: Mon Jan 16, 2006 10:17 pm
- Location: Sydney, Australia
Ray,
I think your statement is incorrect.
When you have an Oracle stage in a parallel job there is an option for 'partition table'. By using this you can read in parallel.
How does it work: you give a tablename to this parameter. DS will look for the min and max rowid for this table and will add this in the where clause. So if the table would have min rowid 0 and max rowid 40 and you run with 4 nodes it will run 4 times the query with adding
'where rowid between 0 and 10', between 10 and 20',....
This has nothin to do with the table fysically being partitioned or not.
side-effects:
-if you do anything like sort or aggregate in your job you will need to repartition in the following stages
-it's based on the full table, if you have an extra where clause you could get very unbalanced results
-watch your explain plan, because of the rowid it will become a full tablescan on that table.
I think your statement is incorrect.
When you have an Oracle stage in a parallel job there is an option for 'partition table'. By using this you can read in parallel.
How does it work: you give a tablename to this parameter. DS will look for the min and max rowid for this table and will add this in the where clause. So if the table would have min rowid 0 and max rowid 40 and you run with 4 nodes it will run 4 times the query with adding
'where rowid between 0 and 10', between 10 and 20',....
This has nothin to do with the table fysically being partitioned or not.
side-effects:
-if you do anything like sort or aggregate in your job you will need to repartition in the following stages
-it's based on the full table, if you have an extra where clause you could get very unbalanced results
-watch your explain plan, because of the rowid it will become a full tablescan on that table.
-
- Premium Member
- Posts: 315
- Joined: Tue Oct 31, 2006 3:38 pm
You are right.But if the sql has a where clause,when you use the partiton,the result will have repeated records.Maybe it is a bug.
jasper wrote:Ray,
I think your statement is incorrect.
When you have an Oracle stage in a parallel job there is an option for 'partition table'. By using this you can read in parallel.
How does it work: you give a tablename to this parameter. DS will look for the min and max rowid for this table and will add this in the where clause. So if the table would have min rowid 0 and max rowid 40 and you run with 4 nodes it will run 4 times the query with adding
'where rowid between 0 and 10', between 10 and 20',....
This has nothin to do with the table fysically being partitioned or not.
side-effects:
-if you do anything like sort or aggregate in your job you will need to repartition in the following stages
-it's based on the full table, if you have an extra where clause you could get very unbalanced results
-watch your explain plan, because of the rowid it will become a full tablescan on that table.
Yes hello105, you are right.
When i set the partition table option while using a user-defined SQL query that joins multiple tables, I end up with duplicates.
The data read becomes faster, but to avoid the duplicates if we add a remove dups stage, it slows down again negating the speed that i achieve with Partition Table option..
Is this really a bug or is there something that has to be set to avoid generating duplicates.
When i set the partition table option while using a user-defined SQL query that joins multiple tables, I end up with duplicates.
The data read becomes faster, but to avoid the duplicates if we add a remove dups stage, it slows down again negating the speed that i achieve with Partition Table option..
Is this really a bug or is there something that has to be set to avoid generating duplicates.
Reading an ORACLE non-partitioned table in parallel
I do not know if this will help but I use the modulus statement in the 'WHERE' clause to simulate a parallel read. If I want 4 degrees of parallelism I create 4 Oracle stages and use the mod function in the 'WHERE' clause. However, you should use it with a numerical key.
A hack I know but hey, it reads a non-partitioned table in parallel.
Example of 4 degrees of parallelism :
ORACLE_STAGE_1
WHERE mod( key_field, 4) = 0
ORACLE_STAGE_2
WHERE mod( key_field, 4) = 1
ORACLE_STAGE_3
WHERE mod( key_field, 4) = 2
ORACLE_STAGE_4
WHERE mod( key_field, 4) = 3
A hack I know but hey, it reads a non-partitioned table in parallel.
Example of 4 degrees of parallelism :
ORACLE_STAGE_1
WHERE mod( key_field, 4) = 0
ORACLE_STAGE_2
WHERE mod( key_field, 4) = 1
ORACLE_STAGE_3
WHERE mod( key_field, 4) = 2
ORACLE_STAGE_4
WHERE mod( key_field, 4) = 3