Slow loading using DataStage with Teradata table as Target

rkumar28 · Post by **rkumar28** » Fri Apr 09, 2004 12:43 pm

Hi,

I have a simple mapping on datastage. I am loading data from a flat file (with 19 columns with about 500,000 rows)into single Teradata table through a transformer. I don't have any native stage driver for Teradata, hence using the ODBC stage to connect to teradata database. But it is taking more than 12 hours to complete the loading(computing about 6 rows/sec). I have dropped the Secondary Index from the target table and increase the rows per transaction in the ODBC stage.
I am not able to drop the primary Index on the table. I believe we cannot drop primary Indexes in teradata.

Is there a way to Increase the performance of this simple loading?
I will really appreciate any help?

Thanks

kcbland · Post by **kcbland** » Fri Apr 09, 2004 1:06 pm

Have you ruled out using the bulkloading utilities available with Teradata (Multiload, etc)?

rkumar28 · Post by **rkumar28** » Fri Apr 09, 2004 2:06 pm

Thanks for the quick response. Actually, I have never worked on the MLoad utility. Since the company I am working for has been using DataStage for sometime, so I am using the this ETL tool for this loading.

Is DataStage is not very effective with Teradata? I will be greatful if you can suggest anything with DataStage to make the load more efficient.

kcbland wrote:Have you ruled out using the bulkloading utilities available with Teradata (Multiload, etc)?

kcbland · Post by **kcbland** » Fri Apr 09, 2004 3:07 pm

It has nothing to do with DataStage's efficiency. Every database has an optimal loading solution for data warehouse environments: the bulk loader. Teradata is specifically a data warehouse database. It is less a relational database and more a specific creature of the data warehouse world. Redbrick and Teradata are designed to be loaded using their bulk loader, while the ODBC driver makes everything "easier", it's not how the databases were "meant" to be loaded.

Without a doubt, you should read as much as you can about Teradata and decide for yourself. ODBC/OCI/ with DML based loading is pathetic, I don't care which database you are talking about. Oracle added DIRECT path loading to sqlldr to combat the inefficiencies with a DML based CONVENTIONAL load. Redbrick and Teradata suffer none of the burden of having to be backwards compatible with being a transactional database.