PX job with Teradata Mload taking too long to execute

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

PX job with Teradata Mload taking too long to execute

Post by kausmone »

Hello,

I have a job that combines data from three datasets using a funnel; the merged data is then loaded into a Teradata table using the Teradata MultiLoad stage. When I ran the job, it took 8 hours to load 334,000 records, which is obviously far too long.

On checking the Director logs, I noticed that the time lag between the two messages below accounted for almost the entire runtime, i.e. 8 hours:

11/27/2007 5:50:19 PM Logging delayed metadata.
11/28/2007 1:53:23 AM Requesting delayed metadata

I'm not sure what these mean or whether they indicate why the job took so long. I searched the forum and found that various users have encountered these messages with various other stages, including FTP, DB2, etc., but no solution has been put forth.

Can anyone suggest what the problem might be, i.e. what could make the job take this long to load?

The other details of the job are as follows:
The Funnel stage uses the "Continuous Funnel" type. The job monitor shows that all records are funneled in next to no time.

TDMLoadPX Stage:
Load Utility: MultiLoad
Limit Settings: Session Max 2
All other settings (Session Min, Error Limit, Checkpoint, etc.) set to zero

Any help greatly appreciated! :)

Thanks,
Kaus
DS_FocusGroup
Premium Member
Posts: 197
Joined: Sun Jul 15, 2007 11:45 pm
Location: Prague

Post by DS_FocusGroup »

I always get this "logging delayed metadata" message with MLOAD in PX, but my jobs run fine. Have a look at your report file; it may contain logs that identify the problem. Teradata mainly takes time when there are duplicate-row errors or when the incoming rows are not properly distributed. If all 300K of your rows go to one AMP, that is a problem, although in my view it should still be able to insert that much data far more quickly than 8 hours. How many Teradata nodes are you using, and how many AMPs?
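
A rough way to see that distribution without the GUI is a skew check like the one below, run from BTEQ or SQL Assistant; it counts rows per AMP. This is only a sketch, and the database, table, and primary-index column names are placeholders you would substitute with your own.

SELECT HASHAMP(HASHBUCKET(HASHROW(your_pi_column))) AS amp_no,
       COUNT(*) AS row_count
FROM   your_database.your_target_table
GROUP  BY 1
ORDER  BY row_count DESC;

If one amp_no ends up with most of the rows, the data is badly skewed.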
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Hi there,

Thanks for your reply. Unfortunately, I'm a first-timer, so I don't know how to check the node or AMP configuration. Could you guide me, please?

Thanks,
kaus
DS_FocusGroup
Premium Member
Posts: 197
Joined: Sun Jul 15, 2007 11:45 pm
Location: Prague

Post by DS_FocusGroup »

Go to Teradata Administrator, right-click on the target table after the data has been loaded into it, and select "Space By AMP". Paste the result here and maybe I can help you with this issue. Also let me know whether error tables were created while you were loading.
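
If you don't have Teradata Administrator handy, roughly the same picture should be available from the DBC.TableSize dictionary view; treat this as a sketch, with the database and table names below as placeholders.

SELECT Vproc,
       SUM(CurrentPerm) AS perm_bytes
FROM   DBC.TableSize
WHERE  DatabaseName = 'your_database'
AND    TableName    = 'your_target_table'
GROUP  BY Vproc
ORDER  BY perm_bytes DESC;

Each Vproc is an AMP, so one AMP holding most of the space points to the same skew problem.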
trokosz
Premium Member
Posts: 188
Joined: Thu Sep 16, 2004 6:38 pm
Contact:

Post by trokosz »

The Teradata MLoad stage is not parallel but serial, so you get serial performance. You are introducing a serial process into a parallel job, so it is not surprising that your performance suffers. If you want performance, you need to move to the Teradata Enterprise stage.
Govindarajan
Participant
Posts: 24
Joined: Mon Jul 12, 2004 10:16 am

Re: PX job with Teradata Mload taking too long to execute

Post by Govindarajan »

a) First, make sure that all your datasets are created with the parallel option.
b) Make sure that all three datasets have the same keys and sort order before passing them into the Funnel stage.
c) Instead of the Teradata MLoad stage, use the Teradata Enterprise stage.
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Hello,

Thanks for your thoughts. As DS_FocusGroup predicted above, an error table is being created, and it shows that duplicate rows are being inserted. I'll try it with a Remove Duplicates stage and hopefully that will resolve it.
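
For the record, if I land the funneled rows into a work table first, a quick check like the one below shows how many keys repeat before the load; the table and key column names here are just placeholders for whatever is being loaded.

SELECT your_key_column,
       COUNT(*) AS dup_count
FROM   your_work_table
GROUP  BY your_key_column
HAVING COUNT(*) > 1;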

Thanks,
Kaus
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Yep, works like a charm after removing the duplicate rows. The same set of data (minus the approximately 13,000 duplicates) loaded in 1 minute 51 seconds.

Thanks..!! :)