Setting a Teradata primary index for a write replace stage

bcarlson · Post by **bcarlson** » Tue Apr 15, 2008 8:30 pm

I am using a Teradata Enterprise stage and I want to explicitly state the primary index to use when creating a target table (using the replace option). In DB2, I would use the syntax 'key=mykey1,mykey2'. Does this option work for Teradata as well? If so, what is the syntax? Is it any different than DB2? Is there documentation for it?

Note - I have not done an exhaustive search on DSXChange or in the documentation, but I have searched.

Thanks!

Brad

bcarlson · Post by **bcarlson** » Tue Apr 15, 2008 8:31 pm

By the way, the reason I need this is because my input dataset has a 3-field natural key. The first 2 will not partition well. By default, if I do not specify a primary index (aka partitioning key), then DataStage will use the first field in the dataset as the key. Since that field will not distribute well, I need to use a different field and need to explicitly state in the dboptions.

Brad.

bcarlson · Post by **bcarlson** » Tue Apr 15, 2008 9:55 pm

Okay, answering my own question. It is late and my coffee has run out. After digging a bit more in the documentation I found the primary index option right in front of my nose. Not sure how I missed it.

Anyway, thank you for sharing in my brain fart! Problem solved (drink more coffee before thinking).

Brad.

ray.wurlod · Post by **ray.wurlod** » Tue Apr 15, 2008 9:57 pm

So, coffee is a brain laxative? Or is it lack of coffee?

bcarlson · Post by **bcarlson** » Wed Apr 16, 2008 12:28 pm

Beans beans the magical fruit, the more you drink the more you ... think, so lets have beans for every drink!

Tariq1981 · Post by **Tariq1981** » Fri Jan 23, 2009 4:03 pm

bcarlson wrote:Okay, answering my own question. It is late and my coffee has run out. After digging a bit more in the documentation I found the primary index option right in front of my nose. Not sure how I missed it.

Anyway, thank you for sharing in my brain fart! Problem solved (drink more coffee before thinking).

Brad.

I want to define those keys as unique primary index. Teradata EE define them as primary index only which is not unique. Anyone can help me please?

throbinson · Post by **throbinson** » Sun Jan 25, 2009 1:08 pm

How about a Unique Secondary Index definition in the Close Command? Why the requirement for a UPI?

chulett · Post by **chulett** » Sun Jan 25, 2009 3:16 pm

Hmmm... isn't a "primary" index by definition unique?

throbinson · Post by **throbinson** » Sun Jan 25, 2009 4:13 pm

You mean a Primary Key. A Primary Index is how the data gets hashed across AMPS in a Teradata DBMS. They can be two very different things.

Tariq1981 · Post by **Tariq1981** » Sun Jan 25, 2009 4:25 pm

throbinson wrote:You mean a Primary Key. A Primary Index is how the data gets hashed across AMPS in a Teradata DBMS. They can be two very different things.

Yah Right. What I know that UPI give us some performance advantages over the PI. Please correct me If I am wrong. So, we want to define those keys as UPI.

bcarlson · Post by **bcarlson** » Mon Jan 26, 2009 9:40 am

throbinson wrote:You mean a Primary Key. A Primary Index is how the data gets hashed across AMPS in a Teradata DBMS. They can be two very different things.

The Pimary Index is indeed how data is hashed across the amps, but a Unique Primary Index gives the same benefit (and better performance) as a primary key. Most of the TD Masters and DBA's that we have worked with have recommended UPI's vs. PKs.

Unfortunately, I do not know how to create them dynamcially with the TD Ent. stage and the replace option.

Just a thought... Have you tried enforcing a unique sort on the same key(s) prior to replacing your table? I wonder if DS is smart enough to recognize that if teh data is unique going into the TD stage, that the PI it creates should also be unique....

Brad.

hamzaqk · Post by **hamzaqk** » Tue Jan 27, 2009 6:25 am

The usage of UPI or USI depends on what you really require and what the needs are in terms of performance. Performance is really an issue when we need to pull data out of a table join it to some other table etc. Primary key is just a concept to enforce uniqueness to a record and in TD we apply this by both USI and UPI. They both work in the same way, the only difference is how the data is internally referenced in Teradata. In UPI you have a hashed value that points to the actual data on the amp. in USI you have a hashed value which then points to a value which has the information pertaining to where the data is stored on the amps. Just an extra hop! Secondary indexes do not form the basis of data distribution also.

P.S a combination of USI and a PI would be a good idea to be put on a table if you want data to be uniformly distributed amongst amps and at the same time to enforce uniqueness on an incoming record.

i am sleepy. hope it made sence