Load same teradata table 2 times in a single job

Vidyut · Post by **Vidyut** » Tue Feb 15, 2011 9:00 am

Hi

Is there any way that i can load a same teradata table 2 times in a single job....
The reason i need this is that i want to load duplicates in the table and if I use Mload for all the records than it effect the performance...
So what m trying to do is load the remaining data using the Teradata Enterprise Stage and duplicates using Teradata Mload....

Any Suggestion

Thanks in Advance

jwiles · Post by **jwiles** » Tue Feb 15, 2011 9:06 am

Is there a constraint or other restriction which keeps you from loading all of the records using TD Enterprise? If the records are in separate streams and you must load the duplicates after the rest, one method could use a sequence funnel into a single TD Enterprise stage.

Or, you could use a sequence job as your single job point of view

Regards,

Vidyut · Post by **Vidyut** » Tue Feb 15, 2011 9:10 am

Actually i was having data in a single stream only....but Teradata Enterprise Stage Silently drop all the duplicates....
Thats y i separated the data into two different streams....one for faster load (around 95% of data) and one for slower load(duplicate records)

Vidyut · Post by **Vidyut** » Tue Feb 15, 2011 9:18 am

One more issue I forgot to mention is that I want to avoid to create one different job for loading the duplicates bcoz I have hundreds of jobs & sequencers already designed having the same issue, so adding a new job is the last option..

Thanks

ShaneMuir · Post by **ShaneMuir** » Tue Feb 15, 2011 10:15 am

Do you actually need to load the duplicates? Or are they actually updates?

Is there a reject link on the teradata stage? If so it might be possible to reject out to a lookup stage and join your updates at that point, before updating those records via another teradata stage.

All the inserts should still happen, there will be no rejects off the teradata stage. On the lookup set the records not found condition to reject, and then using the records off the lookups reject link to perform the update. The lookup should act as a holding point until such time as all the inserts have finished.

jwiles · Post by **jwiles** » Tue Feb 15, 2011 10:23 am

So the question comes back to: Does the table being loaded have a unique constraint of some sort (unique primary key, etc.)? TD Enterprise will be restricted to what constraints are in effect at the time of the load.

The TeraData Connector stage has an option to allow duplicates to be loaded.

Read the Connectivity Guide for TeraData databases. It's a PDF file included with the product and should be available through the Information Center installed on the Services Tier server and possibly on the client workstation as well.

Regards,

Vidyut · Post by **Vidyut** » Tue Feb 15, 2011 11:20 am

Thanks for the replies

I have millions of records and in which thr r some duplicates(few hundred)(complete duplicates i.e, all columns r same)
1) Issue with TD Enterprise Stage is it silently drops the duplicates
2) If i use TD Mload the performance is a big issue for me
3) If I use TD connector then again if i use fastload the it drops the duplicates silently or else the performance is the issue if use it otherwise

Thanks

jwiles · Post by **jwiles** » Tue Feb 15, 2011 11:51 am

And you have yet to provide an answer to my question

Vidyut · Post by **Vidyut** » Tue Feb 15, 2011 11:54 am

Nopes...no key defined

cwong · Post by **cwong** » Tue Feb 15, 2011 12:42 pm

Suggestion is that you may google to read up about "Multiset" table in teradata.

jwiles · Post by **jwiles** » Tue Feb 15, 2011 1:06 pm

Ok. It may be a limitation of the TD Enterprise stage. I don't see an option for allowing duplicate rows listed in the connectivity guide section on TD Enterprise. Maybe someone else on the forum has more information...I'm pretty much at my TD knowledge limit here

Regards,

Vidyut · Post by **Vidyut** » Wed Feb 16, 2011 2:02 am

Ok jwiles...
thanks all for ur inputs....
still trying to figure out a way

Cwong: We can insert using Mload but in my case the performance is a very big factor.....So what I am wondering if it is possible to load a same table 2 times in a single job using two different stages....Thanks

If any one has any more suggestion please post

Thanks

mansoor_nb · Post by **mansoor_nb** » Thu Feb 17, 2011 2:33 am

As you said the duplicate records are very minimal (in hundereds), identify these duplicate records and seperate it with the main stream. Using the link ordering execute the link which has only duplicate records and then the other one. Something like below.

Source ------> Identify Unique/Duplicates --- Link 1 -----> Load Duplicate
|
| Link 2
|
Load Unique records

By doing this, there are chances of getting a DB deadlocks, you need to handle the deadlocks

Or the other option is to have the file created and to have all the duplicate records at the end of the unique records. Write a BTEQ query to load the data into the TD environment.

Vidyut · Post by **Vidyut** » Thu Feb 17, 2011 2:45 am

Hi mansoor

The first option creates deadlock and I couldnt find a way out....thats y i posted this

Can u please explain a little abt the second option...Thanks in advance

ray.wurlod · Post by **ray.wurlod** » Thu Feb 17, 2011 4:32 am

Not "u", but "you". Not "abt", but "about".

U was a city in ancient Babylonia. Abt was an engineer who invented a particular style of cogged railway for steep ascents and descents.