How to load data from oracle partitioned table in Px.

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
srini_ramesh
Participant
Posts: 13
Joined: Fri Oct 08, 2004 6:19 am

How to load data from oracle partitioned table in Px.

Post by srini_ramesh »

Hi,
I have to load data from a range partitioned oracle table, partitioned by date on weely basis. My target table is a non partitioned oracle table. I need to know what oracle stage settings do i need to make to load the data from this partitioned table successfully. The source table contains around 66 million records. My concern is whether the job will be able to load this volume of data successfully or not. I have seen that the ORAENTER stage offers a host of partition settings. However, I am not very sure which setting will be best to make in the source and target stages. Is it ok if i leave everything to default setting. Or should I specify some setting. What does preserve partioning in the advanced tab of stage property do? And what is the "partition table" option in the source oraenter stage meant for?
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Post by T42 »

Whatever happens on the Oracle database is (usually) independent of what you do within DataStage. The fact that the Oracle database table may be partitioned or not is not something DataStage concern themselves with at this time.

In other words, just pull everything from the Oracle Input and write it to Oracle Output, and it'll work.

What does the "Partition Table" option means? It means that DataStage will split itself up in as many nodes as you set in the configuration file, and each one of them go out to the Oracle table and pull data based on basic range lookup value on row counts (I believe -- not sure exactly). It is a faster form of pulling data. However, performance can really hurt on Oracle's partitioned tables.

As far as I know, Oracle and Ascential are not in a partnership where they share enough information for Ascential to utilize Oracle's partitioning. At least not on the same level as Ascential and DB2 does.
dsxuserrio
Participant
Posts: 82
Joined: Thu Dec 02, 2004 10:27 pm
Location: INDIA

Post by dsxuserrio »

Srini
Is the target table out of your DW/DM?? While designing you would have addressed the partitioinging issues. I am wondering how come the target is not partitioned according to your needs.
Thanks
dsxuserrio
dsxuserrio

Kannan.N
Bangalore,INDIA
GIDs
Participant
Posts: 16
Joined: Sun May 23, 2004 2:39 pm
Location: San Francisco

Re: How to load data from oracle partitioned table in Px.

Post by GIDs »

srini_ramesh wrote:Hi,
I have to load data from a range partitioned oracle table, partitioned by date on weely basis. My target table is a non partitioned oracle table. I need to know what oracle stage settings do i need to make to load the data from this partitioned table successfully. The source table contains around 66 million records. My concern is whether the job will be able to load this volume of data successfully or not. I have seen that the ORAENTER stage offers a host of partition settings. However, I am not very sure which setting will be best to make in the source and target stages. Is it ok if i leave everything to default setting. Or should I specify some setting. What does preserve partioning in the advanced tab of stage property do? And what is the "partition table" option in the source oraenter stage meant for?
Definitely, PX will be able to handle this volume (very easily, indeed!!)

Is the source just one table? or is it a join?? if it just one table, set the 'partition table' to the source table name, what this does is that per your configuration file (lets say 4 nodes), PX engine accesses the oracle db_extents and db_segments and other system tables and based on a ROWID range (ROWID is the unique number internally generated by Oracle for each record in the table), PX splits the table 'read' into 4 partitions, please note, this is irrespective of the partitioning methodology you would have used for the source table. Depending on your CPU capacity, you can increase the number of nodes in your configuration file. To answer your next question on 'Preserve partitioning', this basically tells PX, to maintain the same partitioning as data moves along to the next stage, this has nothing to do with the source table.
Post Reply