Teradata load timings...

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
lamphic
Participant
Posts: 10
Joined: Mon May 10, 2004 8:52 am
Location: Des Moines, Iowa
Contact:

Teradata load timings...

Post by lamphic »

We are a divivsion of an Insurance company that is moving from UDB DB2 on UNIX to Teradata. This is because 2 Data Warehouse groups for 2 divisions are merging technologies. The other division has had Teradata but no Datastage. Our divsion has had Datastage for 4 years. The way that the other division has loaded data to the Teradata server is via MLoads, Bteqs, and Fast Loads running on the mainframe.

The other division is looking at the functionality of using DataStage to load directly to tables on Teradata, however they are thinking that FTPing data from the Unix box to a mainframe and then loading that data to Teradata from the mainframe is faster than having Datastage directly load the data to Teradata using Datastage's plug-ins.

Does anyone have experience with loading data via Datastage from a Unix box to a Teradata box? Does anyone have any thoughts that I might be able to use in a Proof of Concept I am working on?

We do not have parallel extender but could possibly install it if it would help increase the timing on these data loads to Teradata tables. A problem that we are facing on the Teradata server is that it will only allow 6 Multiloads to run at a time and we only have a certain window to run all of the multiloads we need to complete. That window is full, so we need to decrease the time it takes to run the loads.

Thank you
Specialist
Allied Information Systems
Data Warehouse Group
1100 Locust St Dept 0900
Des Moines, IA 50391-0900
phone (515) 508-4463
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Post by T42 »

I do not know if EE have any direct loading mechanism for Teradata (I haven't used that database yet), so I can't really help you on this. (Basically, I do not know whether they install an EE client on the Teradata box itself, or use some global API call to it from the EE box.)

I would definitely recommend that you contact Ascential to get some numbers on this information. They do heavy research on this subject and should be able to provide you with some details.
l_homme_de_rio
Participant
Posts: 17
Joined: Fri Jul 30, 2004 1:16 am

Post by l_homme_de_rio »

PX and TD are great working together !!! it's probably the best configuration you can have. Performances are really impressive ! Obviously you will need some hints around PX/TD like timestamp support or # of multiload sessions.
Bteq (TD "PL/Sql") is really a nightmare to maintain.
PX is really THE "must have" conf within a TD environment
If you want more details ask me
All the best
tombastian
Premium Member
Premium Member
Posts: 41
Joined: Fri Jun 04, 2004 5:52 am
Location: Bangalore

Post by tombastian »

Hi,
Is Parallel Extender part of DataStage Enterprise Edition or Is it a seperate product?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Parallel Extender is a part of DataStage Enterprise Edition. :D
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
aartlett
Charter Member
Charter Member
Posts: 152
Joined: Fri Apr 23, 2004 6:44 pm
Location: Australia

Post by aartlett »

Just got onto this thread, hope I'm not too late :).

We are using teradata backend (NCR unix) from a solaris box. We mix multiload (mload) and direct data/stage writes depending on volume.

Multiload has a 20 second startup time, if the time to load the number of records is < 20 secionds on direct load, then direct load is obviously the way to go. if > 20 seconds then mload is the way. We found our breakeven point to be 1000 input records for upsert handling (select, insert else update).

We are going across a dedicated GB link between the datastage box and the teradata box. I'm not sure why you would upload to a mainframe to load to teradata as I thought teradata only ran on win32 and unix.

Talk to the NCR/Teradata DBA's, find a good friendly one who won't BS you.

Another thing, as you are running to a distributed database (across a network) is to try and get a dedicated link, GB ifpossible, and create lookup sets if volumes aren't to high. We are creating hash lookups (server) and lookup files (PX) when the ratio of avg input records:target lookup table is 1:10 or less as this greatly speeds up the lookup process. We create the lookups in a seprate job run in the sequencer. Use T/D fastexport for PX, make sure the utility is loaded first (had THAT problem on a deployment weekend).

Any problems or further discussion: Feel free to ask


Andrew
Post Reply