DS v8 and Teradata V2R6.2 large decimals

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

DS v8 and Teradata V2R6.2 large decimals

Post by bcarlson »

We are on DS v8.0.1 and writing to Teradata V2R6.2. Our data has 2 fields with a datatype of decimal(20,0), but I get a datatype mismatch when writing to the target table, which also has these fields defined as dec(20). I have a test job that is just a dataset with these 2 fields and a target Teradata Enterprise stage.

If I use truncate mode, the job fails with the datatype mismatch error. If I change it to replace mode, then it creates the new table with dec(20) fields but truncates the data down to 18 digits.
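
For reference, the target is essentially shaped like this (table and column names changed, just to show the two dec(20) fields):

-- Illustration only: hypothetical names standing in for our real target.
-- The two DECIMAL(20,0) columns are what trigger the mismatch.
CREATE TABLE target_db.big_dec_target
(
    acct_id    DECIMAL(20,0) NOT NULL,
    trans_amt  DECIMAL(20,0),
    load_dt    DATE
)
PRIMARY INDEX (acct_id);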

DS v8.0.1 is supposed to support V2R6.2 and TTU 8.2 (the corresponding utility bundle, including FastLoad, which is what DS uses to write data with the Enterprise stage). Has anyone proven that it does in fact work? Are there extra settings in either DataStage or the RDBMS that need to be used to 'turn on' the option of using large decimals?

FYI - I had an earlier posting dealing with same issue in DS 7.5.1a. For your reference:

DataStage PX and Teradata V2r6.2 large decimals

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Although DS 8.0.1 supports TTU 8.2, it does so in upward-compatibility mode only. That is, no support was added for new TTU 8.2 features such as BIGINT and large DECIMAL; it only means that compatibility testing was done to ensure that existing jobs continue to work with TTU 8.2.

Most stages will allow you to use a String data type to load into a large decimal column, because the conversion is done by the Teradata server and does not require any enhancement to the Teradata stages. Teradata Enterprise is not one of those stages, however, because it does validation checks to ensure that the link data types match the schema of the target table.
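
For example, the server will happily do a conversion like this on its own (hypothetical table and column names, just to show the idea):

-- Hypothetical example: the client sends a character value and the
-- Teradata server converts it to DECIMAL(20,0) on insert.
CREATE TABLE sandbox_db.big_dec_test (acct_id DECIMAL(20,0));

INSERT INTO sandbox_db.big_dec_test (acct_id)
VALUES ('12345678901234567890');  -- 20-digit value supplied as a string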

There is a patch for the Teradata Connector stage that supports BIGINT and large DECIMAL, but you need to install the Parallel Transporter to use the Teradata Connector stage.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Okay, this totally sucks. We have spent a huge amount of time and money to get DS v8.0.1 in place with Teradata V2R6.2 with the understanding that we'd be able to use the large decimals. We applied the patch and updated the clispb.dat file with the max_decimals_returned=38 and the job still fails:
getColDesc: decimal field has out of range precision.
Dang it. So when you say Teradata Connector, that is different from the Teradata Enterprise stage, isn't it? And I would be willing to bet that we do not yet have TPT installed (or 'purchased' may be more accurate, since it is - of course - a separate license from the standard TTU).

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Yes, the Teradata Connector is a new stage that became available in 8.0.1 FP1. It combines the functionality of the other Teradata stages, but it does require TPT.

If you raise enough of a fuss with IBM, it's possible you could get an enhancement made to the Teradata Enterprise stage to support BIGINT and large DECIMAL. But most enhancement efforts are going into the Teradata Connector.

Another option you could try would be to use the Teradata Load plug-in instead of Teradata Enterprise. The Teradata Load plug-in may not show up on the default Parallel palette, but you can drag it onto the canvas from the Repository view; it should be there under Stage Types -> Parallel -> Database -> Teradata. The Teradata Load plug-in uses the FastLoad utility, so you'd still get FastLoad performance like Teradata Enterprise. The disadvantage is that the stage has to run sequentially on a single node, so you would not get the MPP parallelism of Teradata Enterprise. Since no BIGINT or large DECIMAL enhancements were made to the Teradata Load plug-in either, you'd have to set the input link columns to String and let the Teradata server do the conversion, along the lines of the sketch below.
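
In effect the load ends up doing roughly the equivalent of a parameterized insert with character fields, something like this (table and column names are placeholders):

-- Placeholder names; the link columns are VARCHAR and the server casts
-- the values into the DECIMAL(20,0) target columns.
USING (acct_id VARCHAR(25), trans_amt VARCHAR(25))
INSERT INTO target_db.big_dec_target (acct_id, trans_amt)
VALUES (:acct_id, :trans_amt);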
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Update:

We already had TPT (Teradata Parallel Transporter) installed and now are working on getting the Teradata Connector working in DataStage. On Friday, I was able to read and write a small volume (a few hundred recs) of data with the large decimals (decimal(20) was the datatype), and there was mass celebration. Good way to end the week....

Job: TDConn -> sample -> dataset

Along comes Monday... we try the same job with a full volume pull out of Teradata - in this case, about 9.5M records - and BOOM. Heap size errors. Here are some of the logs:
Teradata_Connector_60,0: The current soft limit on the data segment (heap) size (1610612736) is less than the hard limit (2147483647), consider increasing the heap size limit
Teradata_Connector_60,0: Current heap size: 1,497,937,768 bytes in 14,076,924 blocks
Teradata_Connector_60,0: Failure during execution of operator logic. [api/operator_rep.C:376]
Teradata_Connector_60,0: Fatal Error: Throwing exception: APT_BadAlloc: Heap allocation failed. [error_handling/exception.C:132]
node_myserver: Player 1 terminated unexpectedly. [processmgr/player.C:157]
Sample_58,4: Failure during execution of operator logic. [api/operator_rep.C:376]
Sample_58,4: Fatal Error: waitForWriteSignal(): Premature EOF on node myserver-sw1 Socket operation on non-socket [iomgr/iocomm.C:1636]
node_myserver-5: Player 1 terminated unexpectedly. [processmgr/player.C:157]
main_program: APT_PMsectionLeader(5, myserver-5), player 1 - Unexpected exit status 1. [processmgr/slprocess.C:363]
APT_PMsectionLeader(1, myserver), player 1 - Unexpected exit status 1. [processmgr/slprocess.C:363]
APT_PMsectionLeader(1, myserver), player 2 - Unexpected exit status 1. [processmgr/slprocess.C:363]
Sample_58,0: Failure during execution of operator logic. [api/operator_rep.C:376]

There is also a warning message that I have not encountered before:
Teradata_Connector_60,0: [IIS-CONN-TERA-005010] The estimated row length of 109 bytes multiplied by the array size of 2,000 is 218,000 which exceeds the maximum buffer size of 64,260
Given that it is referring to exceeding buffers, I am suspicious that this is related to the heap size issues. (By that math, the array size would have to come down to roughly 64,260 / 109 ≈ 589 rows to stay under the buffer limit.)

We updated the SQL to convert the large decimals to strings and the job ran successfully.
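
Something along these lines in the extract SQL (table and column names changed):

-- Cast the DECIMAL(20,0) columns to VARCHAR in the source query so the
-- connector only ever sees string data (names here are made up).
SELECT
    CAST(acct_id   AS VARCHAR(25)) AS acct_id,
    CAST(trans_amt AS VARCHAR(25)) AS trans_amt,
    load_dt
FROM prod_db.big_dec_source;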

I did some searches already on this forum and found several issues related to heap sizes, but have yet to find any related to v8 and TD Connector. Has anyone run into this error with v8 and TD Connector and large decimals?

Brad.
Last edited by bcarlson on Wed Oct 08, 2008 3:02 pm, edited 1 time in total.
It is not that I am addicted to coffee, it's just that I need it to survive.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

By the way, we have updated the ulimits for the userid to unlimited for all categories, and this did not change our results.
myserver:/home/dsadm> ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited
myserver:/home/dsadm> ulimit -aS
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited
myserver:/home/dsadm> ulimit -aH
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Okay, answering part of my question - regarding the warning on the array size and exceeding the buffers:

array size while inserting

We'll give this a try...

Brad
It is not that I am addicted to coffee, it's just that I need it to survive.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Reducing the array size and record count got rid of the warning, but we still failed with the heap size error.

Brad
It is not that I am addicted to coffee, it's just that I need it to survive.
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

What is your job design? Any aggregations or joins?
Teradata Certified Master V2R5
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

P.S. Have you set the ulimits to unlimited for the same user as in the DS environment settings?
Teradata Certified Master V2R5
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Ulimit was set to unlimited across the board. Turns out the issue was the Parallel Synchronization settings. We tried a write mode of both immediate (CLI) and bulk (utility) but had not changed the PS settings at all.

We needed to set Parallel Synchronization to Yes and supply a sync table parameter (give DS a table name to use for synchronization, comparable to the old terasync table used by the TD Enterprise stage).

According to the documentation, without this setting DS runs the connector sequentially. Our guess is that the volume we were pulling from Teradata overran the buffers because we were reading so slowly (in serial, and potentially on one node), which gave us the heap allocation errors. Not sure about that explanation, but it seems to make some sense at least.

We still have some testing and trial and error to go through to figure out the right settings for things, but it looks like our large decimal issue is resolved.

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Hmm, sounds more like a memory leak to me. Running in parallel allowed you to work around the issue since the export is now distributed over more processes, so the memory leak did not exhaust each process's heap. You should still open a case with IBM and have the problem investigated.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

I believe we have a case open, and one of our guys was going to look into a more official explanation from IBM of why it now works. Hopefully, if there is a memory leak issue, it will be identified somewhere in the process.

Thanks!

Brad
It is not that I am addicted to coffee, it's just that I need it to survive.