Teradata throughput

Post questions here related to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Teradata throughput

Post by bcarlson »

We have been having issues with throughput in Teradata using DataStage. With DB2, we can run 5 to 10 jobs in parallel (depending on complexity) that load data to or read data from DB2. However, with comparable jobs that run against Teradata, it seems we can only run 1 or 2 at a time.

We know that we may have configuration issues or may not have the correct parameters set for our environment (sessionsperplayer, requestedsessions). We have been working with Teradata to figure out what may be going on.
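
For context, my understanding of the arithmetic behind these two (which may well be wrong, and is part of why I'm asking): sessionsperplayer sets how many Teradata connections each player opens, the stage starts enough players that sessionsperplayer times the player count equals requestedsessions, and requestedsessions should fall somewhere between 1 and the number of AMPs on the system. So, hypothetically, on a 4-node configuration:

Code:

    requestedsessions = 32
    sessionsperplayer = 2
    players started   = 32 / 2 = 16   (4 per node across 4 nodes)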

One of the first things they mentioned is that our MFILES and T30FILES parameters in uvconfig are set unusually low. To be honest, I don't know if these were set "out of the box" or if someone set them intentionally. On production, MFILES is set to 50 and T30FILES to 200. On dev, they are 12 and 1000 respectively.

I have no idea what typical values are, whether they should be proportionate to the size of the system, or what sizing numbers to use when setting these parameters. So I guess that is a question. Also, what exactly are these parameters for? How could they affect how processes run, or better yet, how many processes can run?

What is a good resource for learning about all the parameters in uvconfig?

Thanks, I appreciate the help!

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

T30FILES relates only to dynamic hashed files, so it can be discounted for parallel jobs (provided you're not running more than, say, 30 at a time). Increasing T30FILES enlarges a memory-based table in which sizing information about dynamic hashed files is stored; at about 112 bytes per row, it can usually be increased without materially affecting anything else.
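
To put a number on that (rough arithmetic): raising T30FILES from 200 to 1000 costs about 800 x 112 = 89,600 bytes, call it 90 KB, of memory. Negligible.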

MFILES is the size of the "rotating file pool". DataStage Engine allows you to have as many files apparently open as you like, even though the underlying operating system does not. When you need to access a file that is not genuinely open, the least recently used open file is closed and the freed file unit made available for the now-required file, which is opened. Most UNIX systems have a kernel parameter called something like NOFILES, which is the number of files a user can have open at once. The correct value for MFILES is this value minus 8. (Eight file units are needed for reserved files, such as VOC and SYS.MESSAGE, in the DataStage environment.)
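
As a sketch of how to apply that rule (commands quoted from memory, so verify them on your platform; uvconfig lives in $DSHOME, and a change takes effect only after a regen):

Code:

    $ ulimit -n              # per-process open file limit (NOFILES)
    2048
    $ vi $DSHOME/uvconfig    # set MFILES to that value minus 8
    MFILES 2040
    $ cd $DSHOME
    $ bin/uv -admin -stop    # DataStage must be down for the regen
    $ bin/uvregen            # rebuild the binary config from uvconfig
    $ bin/uv -admin -start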

A good resource for learning about uvconfig is the commentary within the file itself. Then read the Administering UniVerse manual. Best of all is to attend the IBM UniVerse Theory and Practice class (formerly known as UniVerse Internals), though that may be more than you need.
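
One quick check, again from memory: smat -t reports the tunable values the engine is actually running with, which is handy for confirming that a regen took effect.

Code:

    $ cd $DSHOME
    $ bin/smat -t | grep -E 'MFILES|T30FILE'
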
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
throbinson
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

What is limiting you to only 1-2 Teradata jobs? Do they fail when you run more than that? What is the failure?
What is different between DB2 and Teradata?
1. A DB2 job does not have to be re-partitioned, because DataStage is DB2 partition-aware; it is not Teradata partition-aware. Is it a re-partitioning problem or a limitation of the Teradata jobs?
2. Teradata has a limited number of utility slots available, and each FastExport/FastLoad Enterprise stage takes one. How many does your Teradata system have, and how many are needed for your test runs? (See the sketch below.)
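
If you want to see how close you are to that cap, here is a rough sketch, assuming you can read the DBC views (partition names vary a little between Teradata releases): concurrent load/export utilities are limited by the DBS Control setting MaxLoadTasks (often a small number such as 5), and each active utility shows up in DBC.SessionInfo.

Code:

    SELECT Partition, COUNT(*) AS sessions
    FROM   DBC.SessionInfo
    WHERE  Partition IN ('Fastload', 'Export', 'MLoad')
    GROUP  BY 1;

If your DataStage jobs plus anything else running at the time exceed MaxLoadTasks, the extra utilities queue or fail, which would look exactly like a 1-2 job ceiling.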