Running Oracle connector in parallel mode
Moderators: chulett, rschirm, roy
Running Oracle connector in parallel mode
Hi,
I am trying to extract data from oracle table using oracle connector.
execution mode is default parallel.
When i am setting Enable partitioned reads option yes and giving partitioning read method as rowid hash, sql giving more no of records rather than if i keep Enable partitioned reads option as no. (Also i queried the DB directly and its matching with the record count if Enable partitioned reads option is No)
My intention is to run this stage parallel for better performance.
Please suggest how can i do this.
I am trying to extract data from oracle table using oracle connector.
execution mode is default parallel.
When i am setting Enable partitioned reads option yes and giving partitioning read method as rowid hash, sql giving more no of records rather than if i keep Enable partitioned reads option as no. (Also i queried the DB directly and its matching with the record count if Enable partitioned reads option is No)
My intention is to run this stage parallel for better performance.
Please suggest how can i do this.
Thanks,
Akarsh Kapoor
Akarsh Kapoor
Hi
I tried using the parallel reads option and it actually reduced the rows per second from 11k to 8k. I suspect this is because DS inserts an additional where clause to do the partitioning which simply slows down the processing.
In addition, because I was using my own sql statement with table aliases, I had to remove the aliases as it conflicted with the inserted where clause.
No problem with auto generated sql statements.
I also got double the number of rows on a look up link but I'm not sure why!
Anyway, I would be interected to know how you get on with performance.
Hope this helps,
Bob.
I tried using the parallel reads option and it actually reduced the rows per second from 11k to 8k. I suspect this is because DS inserts an additional where clause to do the partitioning which simply slows down the processing.
In addition, because I was using my own sql statement with table aliases, I had to remove the aliases as it conflicted with the inserted where clause.
No problem with auto generated sql statements.
I also got double the number of rows on a look up link but I'm not sure why!
Anyway, I would be interected to know how you get on with performance.
Hope this helps,
Bob.
Bob Oxtoby
Running Oracle connector in parallel mode
Hi All,
Still help is needed.
Problem is i have 90M record in my oracle table. While extracting it is reading appx 10K record per sec and taking 2.5 hrs. Please suggest how can i increase extraction rate.
Still help is needed.
Problem is i have 90M record in my oracle table. While extracting it is reading appx 10K record per sec and taking 2.5 hrs. Please suggest how can i increase extraction rate.
Thanks,
Akarsh Kapoor
Akarsh Kapoor
Hi,
I think I would start with oracle itself.
If you run your sql statement in an oracle IDE does it still take 2.5 hours to complete?
Is the table partioned? (A table with 90m rows is a good candidate for partioning!)
Have the stats been run on the table recently? (Your dba will know) Poor stats will slow things done significantly.
Cheers,
Bob.
I think I would start with oracle itself.
If you run your sql statement in an oracle IDE does it still take 2.5 hours to complete?
Is the table partioned? (A table with 90m rows is a good candidate for partioning!)
Have the stats been run on the table recently? (Your dba will know) Poor stats will slow things done significantly.
Cheers,
Bob.
Bob Oxtoby
Running Oracle connector in parallel mode
Hi Chullet,
I am just using the below query
select
Field1
from Table
And taking distinct value using sort stage as its huge data.
Job is first reading data from Oracle stage and then passing to sort stage and not using pipelining.
Also please tell me where to use @FALSE constraint.
I am just using the below query
select
Field1
from Table
And taking distinct value using sort stage as its huge data.
Job is first reading data from Oracle stage and then passing to sort stage and not using pipelining.
Also please tell me where to use @FALSE constraint.
Thanks,
Akarsh Kapoor
Akarsh Kapoor
Hi,
Try changing your sql to:
select DISTINCT
Field1
from Table
This will considerably reduce the amount of data coming from the database server which alone should speed things up.
This should also remove the need for the sort stage which again should speed things up.
Hope this help,
Bob.
Try changing your sql to:
select DISTINCT
Field1
from Table
This will considerably reduce the amount of data coming from the database server which alone should speed things up.
This should also remove the need for the sort stage which again should speed things up.
Hope this help,
Bob.
Bob Oxtoby
Running Oracle connector in parallel mode
Hi BOB,
Earlier i was using DISTINCT in query itself but was taking more time. So i am doing it in DS itself.
Earlier i was using DISTINCT in query itself but was taking more time. So i am doing it in DS itself.
Thanks,
Akarsh Kapoor
Akarsh Kapoor
Re: Running Oracle connector in parallel mode
Hi BOB,
I added DISTINCT in query and also changed some settings as below
Array Size - 40000
PreFetch Row Count - 40000
PreFetch Buffer - 4096
Now job is taking just 4.5 mins, but results are not matching
In DS
select DISTINCT
Field1
from Table
giving o/p as 573684 records
In Oracle DB
select DISTINCT
Field1
from Table
giving o/p as
1022779
Not sure why such huge difference.
I added DISTINCT in query and also changed some settings as below
Array Size - 40000
PreFetch Row Count - 40000
PreFetch Buffer - 4096
Now job is taking just 4.5 mins, but results are not matching
In DS
select DISTINCT
Field1
from Table
giving o/p as 573684 records
In Oracle DB
select DISTINCT
Field1
from Table
giving o/p as
1022779
Not sure why such huge difference.
Thanks,
Akarsh Kapoor
Akarsh Kapoor
Re: Running Oracle connector in parallel mode
Constraints are an element of Transformer output links. For the record, you could have added one before the Sort stage and set that value there to gauge how long the source portion of the job consumed. The remainder would have been either the sorting or the target writes and (if needed) you could continue to play the same game to isolate each.akarsh wrote:please tell me where to use @FALSE constraint.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
It may speed up processing, maybe even considerably. But as always, it... depends. On many things. SMP or MPP architecture, I/O bandwidth, available CPUs and memory for sorting / hashing / I/O buffers for example are needed. And generally it doesn't help a simple select like this one. Never mind that Oracle will happily ignore the request if that number of parallel server processes are not available.
And just to clarify, the hint is for the degree of parallelism, four parallel server processses are created which has nothing to do with the number of processors.
And just to clarify, the hint is for the degree of parallelism, four parallel server processses are created which has nothing to do with the number of processors.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers