TD API Load Performance

kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

TD API Load Performance

Post by kishorenvkb »

Hello Everyone,

I am reading from a text file into a transformer, where I default a couple of columns to null (SetNull()) and one other column to CurrentTimestamp(). The output then flows to a Teradata API stage that does upserts. On any given day, the text file contains no more than 500 records.
On average, the job spends about 3 to 5 hours in the step "Logging Delayed Metadata" and eventually gets to "Requesting Delayed Metadata".
I am looking at the run-time stats in Teradata and there is no significant CPU or I/O consumption for this job.
I am having a hard time understanding where it is spinning its wheels for those 3 to 5 hours.
Please shed some light, and let me know if you need more information.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There are several other posts here that mention "Logging Delayed Metadata" and some are Resolved, so it might help to check those if you haven't already.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

I have searched on the topic and none of those posts really answered my question. Please assist.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ummm... that was assistance. So, none of those discussions on unbalanced AMPs or duplicates in the data are applicable to you? Have you involved your Teradata DBA as they suggested?
-craig

"You can never have too many knives" -- Logan Nine Fingers
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

Thanks for that assistance.

Yes, as stated, I have already checked the distribution for unbalanced AMPs. It is about as evenly distributed as it can be.

The table is a set table and the API stage is doing the upserts based on the key.
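Since the upserts are keyed, one quick thing to rule out on the file side is repeated keys within a single day's input. Below is a minimal sketch of such a check, assuming a pipe-delimited file with the key in the first column; the delimiter, the column position and the name `duplicate_keys` are all hypothetical and would need to match the real file layout.

```python
import csv
import io
from collections import Counter


def duplicate_keys(lines, key_index=0, delimiter="|"):
    """Return {key: count} for every key that appears more than once.

    `lines` is any iterable of text lines (an open file works).
    The delimiter and key position are assumptions, not the real layout.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    # Skip blank lines, then count how often each key value occurs.
    counts = Counter(row[key_index] for row in reader if row)
    return {key: n for key, n in counts.items() if n > 1}


# Small in-memory sample standing in for the daily text file.
sample = io.StringIO("1001|A\n1002|B\n1001|C\n")
print(duplicate_keys(sample))  # {'1001': 2}
```

For the real file, pass an open file handle instead of the `io.StringIO` sample.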

Yes, I have been working with the DBAs on this one, and it does not seem to be an obvious one.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

OK. There are some Teradata folks here (I'm not) so hopefully one of them will wander by and be able to help. And there's always your official support provider to ping for help.
-craig

"You can never have too many knives" -- Logan Nine Fingers
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi,

I hope you know the difference between a SET table and a MULTISET table.

A SET table does not allow two rows to be exactly the same, so whenever an insert happens, duplicate row checking happens in the background.

How many columns does the table have? If there are many columns, the duplicate row checking will take more time.

Why don't you convert the table to a MULTISET table and try running your job?
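To get a feel for why the duplicate row check can matter, here is a toy model in Python. It is a sketch under stated assumptions and not how Teradata is implemented: it assumes each new row is compared against every stored row that shares its primary index value, and that a MULTISET table skips the comparison entirely. The function name `count_row_comparisons` and the sample rows are made up for illustration.

```python
from collections import defaultdict


def count_row_comparisons(rows, key_index=0, set_table=True):
    """Count full-row comparisons done while inserting `rows`.

    Toy model (assumption): rows are grouped by their primary index
    value, and a SET table compares each new row against the rows
    already stored in the same group. A MULTISET table does no check.
    """
    buckets = defaultdict(list)
    comparisons = 0
    for row in rows:
        bucket = buckets[row[key_index]]
        if set_table:
            is_duplicate = False
            for existing in bucket:
                comparisons += 1
                if existing == row:
                    is_duplicate = True
                    break
            if is_duplicate:
                continue  # a SET table discards the exact duplicate
        bucket.append(row)
    return comparisons


# 500 rows with distinct index values: nothing to compare against.
unique_keys = [(i, "x") for i in range(500)]
# 500 distinct rows that all share one index value: worst case.
skewed_keys = [(1, i) for i in range(500)]

print(count_row_comparisons(unique_keys))                   # 0
print(count_row_comparisons(skewed_keys))                   # 124750
print(count_row_comparisons(skewed_keys, set_table=False))  # 0
```

In this model the cost depends on how many rows share an index value, not only on the number of columns. With just 500 incoming rows the check would only become expensive if the existing table already holds many rows per index value, so it is worth ruling out rather than assuming it is the cause.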

HTH
--Rich