Oracle Connector - Read in Parallel

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
nagarjuna
Premium Member
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Oracle Connector - Read in Parallel

Post by nagarjuna »

Hi ,

I am using the connectors in Datastage 8.5 version . I am just checking different options in the connectors . I am using Oracle connector to read a full table in parallel & write to dataset . The table contains nearly 100 million records . I checked the documentation & found that we have
following options to read data in parallel ..
1) Rowid Hash uses rowid , ora_hash , Mod functions

2) Rowid Range uses rowid

3) Oracle partitions uses oracle partitions on the table

4) Modulus uses a column & mod function

5) Min Max range Uses a columns & finds the range

6) Rowid Round Robin uses Mod & Row_id_row_number functions

Here the table from which I am unloading is not partitioned & I am unloading all columns . Trying to figure out , what is the best method to use ?

According my requirement , 3rd & 4th are not the options . Can somebody explain the best method to read in this scenario ? I have limited access to check the functionality of all the above options .

Please do let me know if the requirement is not clear .

Thanks in advance .
Nag
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The requirement is not clear.

You can read about these Oracle partitioning mechanisms in Oracle manuals or on the internet.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nagarjuna
Premium Member
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Thanks Ray for your response .

In short , My question is : What is the preferred method ( Rowhash vs RowRange ) for extracting data in parallel ?Which one gives a good performance ? I don't have indexes on the table .It's a simple query ( Select all columns from table) & table is not partitioned .
Nag
nagarjuna
Premium Member
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

I searched in internet & I got very limited info about these . I am not so sure if these ( Row Range , Row Hash ) are related to oracle partitioning methods .
Any inputs would greatly appreciated .
Nag
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

They're definitely not DataStage partitioning algorithms.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

They're definitely Oracle partitioning algorithms.
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

The 6 options listed are choices in the drop-down menu of the Oracle Connector stage property "Partitioned reads method" which is under the "Enable partitioned reads" Yes/No option.

These are documented in the Connectivity Guide for Oracle Databases PDF that should be on your client. If not, it's on the IBM web site.
Support for partitioned read methods:
The connector supports these partitioned read methods: Rowid range, Rowid round robin, Rowid hash, Modulus, Minimum and maximum range, and Oracle partitions.

For all partitioned read methods except the Oracle partitions method, the connector modifies the WHERE clause in the specified SELECT statement. If the WHERE clause is not included in the specified SELECT statement, the connector adds a WHERE clause. For the Oracle partitions method, the connector modifies the specified SELECT statement by adding a PARTITON(partition_name) clause. When the specified SELECT statement contains subqueries, the connector modifies the first SELECT...FROM subquery in the SELECT statement.
It goes on to describe each method...

I doubt any of this applies in your case because your table is not partitioned in Oracle to begin with.
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ding ding ding! :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

Keep your settings as follows:
Partition read method :Rowid Range
Increase your array size and row count 20000 and 40000.

and check it.

as per the server idle ratio, increase the count of array size and row count.
do not enter any value in below two checks.

1>Table name for partition reads.
2>Partition or sub partition name for partition read

Thanks
Thanx and Regards,
ETL User
nagarjuna
Premium Member
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Thanks Eric for the response . I read the documentation .Howver , I am not able to figure it out which option is better RowHash or RowRange .The table may not be partitioned but those options are still applicable in my case . I tried with Row Hash & it is firing 4 parallel queries as I am running on 4 nodes .
Nag
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

You're welcome!

How you partition before landing data in the dataset should depend on what you need to do with the data in subsequent jobs.
Choose a job you love, and you will never have to work a day in your life. - Confucius
Post Reply