TD API Load Performance

Posted: Mon Nov 09, 2009 10:01 am
by kishorenvkb
Hello Everyone,

I am reading from a text file into a Transformer, where I am defaulting a couple of columns to null (SetNull()) and one other column to CurrentTimestamp(), which then flows to a Teradata API stage to do upserts. On any given day, the number of records in the text file is not more than 500.
On average, the job spends about 3 to 5 hours in the step "Logging Delayed Metadata" before eventually getting to "Requesting Delayed Metadata".
I have looked at the runtime stats in Teradata and there is no significant CPU or I/O consumption for this job.
I am having a hard time understanding where it is spinning its wheels for those 3 to 5 hours.
Please shed some light, and let me know if you need more information.

Thanks

Posted: Mon Nov 09, 2009 10:18 am
by chulett
There are several other posts here that mention "Logging Delayed Metadata" and some are Resolved, so it might help to check those if you haven't already.

Posted: Mon Nov 09, 2009 10:21 am
by kishorenvkb
I have searched on the topic and none of the posts really answered my question. Please assist.

Posted: Mon Nov 09, 2009 10:34 am
by chulett
Ummm... that was assistance. So, none of those discussions on unbalanced AMPs or duplicates in the data are applicable to you? Have you involved your Teradata DBA as they suggested?

Posted: Mon Nov 09, 2009 4:00 pm
by kishorenvkb
Thanks for that assistance.

Yes, as stated, I have already checked the distribution and unbalanced AMPs. The data is about as evenly distributed as it can be.
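
For anyone following along, the skew check mentioned above can be done in Teradata with the HASHROW family of functions. This is only a generic sketch; the table and primary index column names are placeholders:

```sql
-- Count rows per AMP for the table's primary index column(s).
-- Replace my_table / pi_col with your actual table and PI column(s).
SELECT
    HASHAMP(HASHBUCKET(HASHROW(pi_col))) AS amp_no,
    COUNT(*)                             AS row_count
FROM my_table
GROUP BY 1
ORDER BY row_count DESC;
```

A large gap between the highest and lowest row_count indicates skewed distribution; roughly equal counts mean the PI is distributing well.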

The table is a SET table, and the API stage is doing the upserts based on the key.
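
For context, a key-based upsert in Teradata is typically the atomic UPSERT form (update, else insert). I don't know exactly what SQL the API stage generates in this job, so this is only an illustrative sketch with placeholder table, column, and parameter names:

```sql
-- Atomic upsert: try the update first; if no row matches the key,
-- fall through to the insert. All names below are placeholders.
UPDATE my_table
SET    col_a   = :col_a,
       load_ts = CURRENT_TIMESTAMP(0)
WHERE  key_col = :key_col
ELSE INSERT INTO my_table (key_col, col_a, load_ts)
VALUES (:key_col, :col_a, CURRENT_TIMESTAMP(0));
```

On a SET table, the insert branch is where the duplicate-row check discussed later in this thread would kick in.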

Yes, I have been working with the DBAs on this one, and it does not seem to be an obvious one.

Thanks

Posted: Mon Nov 09, 2009 9:50 pm
by chulett
OK. There are some Teradata folks here (I'm not) so hopefully one of them will wander by and be able to help. And there's always your official support provider to ping for help.

Posted: Thu Nov 12, 2009 3:08 am
by richdhan
Hi,

I hope you know the difference between a SET table and a MULTISET table.

A SET table does not allow two rows to be exactly identical, so whenever an insert happens, duplicate-row checking happens in the background.

How many columns does the table have? If there are many columns, the duplicate-row checking will take more time.

Why don't you convert the table to a MULTISET table and try running your job?
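
To my knowledge you can't just ALTER a SET table into a MULTISET table; the usual approach is to recreate it and swap names. A rough sketch, with placeholder table and index names, to be adapted and validated with your DBA:

```sql
-- Recreate the table as MULTISET, copying the data and keeping
-- the same primary index. All names here are placeholders.
CREATE MULTISET TABLE my_table_ms AS
    (SELECT * FROM my_table) WITH DATA
PRIMARY INDEX (key_col);

-- After validating row counts, swap the names.
RENAME TABLE my_table TO my_table_old;
RENAME TABLE my_table_ms TO my_table;
```

Note that CREATE TABLE ... AS does not carry over secondary indexes or constraints, so those would need to be recreated on the new table as well.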

HTH
--Rich