Slow loading using DataStage with Teradata table as Target

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
rkumar28
Participant
Posts: 43
Joined: Tue Mar 30, 2004 9:39 am

Slow loading using DataStage with Teradata table as Target

Post by rkumar28 »

Hi,

I have a simple mapping on datastage. I am loading data from a flat file (with 19 columns with about 500,000 rows)into single Teradata table through a transformer. I don't have any native stage driver for Teradata, hence using the ODBC stage to connect to teradata database. But it is taking more than 12 hours to complete the loading(computing about 6 rows/sec). I have dropped the Secondary Index from the target table and increase the rows per transaction in the ODBC stage.
I am not able to drop the primary Index on the table. I believe we cannot drop primary Indexes in teradata.

Is there a way to Increase the performance of this simple loading?
I will really appreciate any help?

Thanks
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Have you ruled out using the bulkloading utilities available with Teradata (Multiload, etc)?
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
rkumar28
Participant
Posts: 43
Joined: Tue Mar 30, 2004 9:39 am

Post by rkumar28 »

Thanks for the quick response. Actually, I have never worked on the MLoad utility. Since the company I am working for has been using DataStage for sometime, so I am using the this ETL tool for this loading.

Is DataStage is not very effective with Teradata? I will be greatful if you can suggest anything with DataStage to make the load more efficient.

kcbland wrote:Have you ruled out using the bulkloading utilities available with Teradata (Multiload, etc)?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

It has nothing to do with DataStage's efficiency. Every database has an optimal loading solution for data warehouse environments: the bulk loader. Teradata is specifically a data warehouse database. It is less a relational database and more a specific creature of the data warehouse world. Redbrick and Teradata are designed to be loaded using their bulk loader, while the ODBC driver makes everything "easier", it's not how the databases were "meant" to be loaded.

Without a doubt, you should read as much as you can about Teradata and decide for yourself. ODBC/OCI/ with DML based loading is pathetic, I don't care which database you are talking about. Oracle added DIRECT path loading to sqlldr to combat the inefficiencies with a DML based CONVENTIONAL load. Redbrick and Teradata suffer none of the burden of having to be backwards compatible with being a transactional database.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply