DS v8 and Teradata V2R6.2 large decimals

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

DS v8 and Teradata V2R6.2 large decimals

Post by bcarlson »

We are on DS v8.0.1 and writing to Teradata V2R6.2. Our data has 2 fields with a datatype of decimal(20,0), but I get a datatype mismatch when writing to the target table, which also has these fields defined as dec(20). I have a test job that is just a dataset with these 2 fields and a target Teradata Enterprise stage.

If I use truncate mode, the job fails with the datatype mismatch error. If I change it to replace mode, then it creates the new table with dec(20) fields but truncates the data down to 18 digits.
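
For reference, the target is essentially shaped like this (table and column names changed, just to show the two dec(20) fields):

-- Illustration only: hypothetical names standing in for our real target.
-- The two DECIMAL(20,0) columns are what trigger the mismatch.
CREATE TABLE target_db.big_dec_target
(
    acct_id    DECIMAL(20,0) NOT NULL,
    trans_amt  DECIMAL(20,0),
    load_dt    DATE
)
PRIMARY INDEX (acct_id);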

DS v8.0.1 is supposed to support V2R6.2 and TTU 8.2 (the corresponding utility bundle, including FastLoad, which is what DS uses to write data with the Enterprise stage). Has anyone proven that it does in fact work? Are there extra settings in either DataStage or the RDBMS that need to be used to 'turn on' the option of using large decimals?

FYI - I had an earlier posting dealing with same issue in DS 7.5.1a. For your reference:

DataStage PX and Teradata V2r6.2 large decimals

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Although DS 8.0.1 supports TTU 8.2, it does so in upward-compatibility mode only. That is, no support was added for new TTU 8.2 features such as BIGINT and large DECIMAL; it only means that compatibility testing was done to ensure that existing jobs continue to work with TTU 8.2.

Most stages will allow you to use a String data type to load into a large decimal column, because the conversion is done by the Teradata server and does not require any enhancement to the Teradata stages. Teradata Enterprise is not one of those stages, however, because it does validation checks to ensure that the link data types match the schema of the target table.
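
For example, the server will happily do a conversion like this on its own (hypothetical table and column names, just to show the idea):

-- Hypothetical example: the client sends a character value and the
-- Teradata server converts it to DECIMAL(20,0) on insert.
CREATE TABLE sandbox_db.big_dec_test (acct_id DECIMAL(20,0));

INSERT INTO sandbox_db.big_dec_test (acct_id)
VALUES ('12345678901234567890');  -- 20-digit value supplied as a string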

There is a patch for the Teradata Connector stage that supports BIGINT and large DECIMAL, but you need to install the Parallel Transporter to use the Teradata Connector stage.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Okay, this totally sucks. We have spent a huge amount of time and money to get DS v8.0.1 in place with Teradata V2R6.2 with the understanding that we'd be able to use the large decimals. We applied the patch and updated the clispb.dat file with the max_decimals_returned=38 and the job still fails:
getColDesc: decimal field has out of range precision.
Dang it. So when you say Teradata Connector, that is different from the Teradata Enterprise stage, isn't it? And I would be willing to bet that we do not yet have TPT installed (or 'purchased' may be more accurate, since it is - of course - a separate license from the standard TTU).

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Yes, the Teradata Connector is a new stage that became available in 8.0.1 FP1. It combines the functionality of the other Teradata stages, but it does require TPT.

If you raise enough of a fuss with IBM, it's possible you could get an enhancement made to the Teradata Enterprise stage to support BIGINT and large DECIMAL. But most enhancement efforts are going into the Teradata Connector.

Another option you could try would be to use the Teradata Load plug-in instead of Teradata Enterprise. The Teradata Load plug-in may not show up on the default Parallel palette, but you can drag it onto the canvas from the Repository view; it should be there under Stage Types -> Parallel -> Database -> Teradata. The Teradata Load plug-in uses the FastLoad utility, so you'd still get FastLoad performance like Teradata Enterprise. The disadvantage is that the stage has to run sequentially on a single node, so you would not get the MPP parallelism of Teradata Enterprise. Since no BIGINT or large DECIMAL enhancements were made to the Teradata Load plug-in either, you'd have to set the input link columns to String and let the Teradata server do the conversion, along the lines of the sketch below.
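
In effect the load ends up doing roughly the equivalent of a parameterized insert with character fields, something like this (table and column names are placeholders):

-- Placeholder names; the link columns are VARCHAR and the server casts
-- the values into the DECIMAL(20,0) target columns.
USING (acct_id VARCHAR(25), trans_amt VARCHAR(25))
INSERT INTO target_db.big_dec_target (acct_id, trans_amt)
VALUES (:acct_id, :trans_amt);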
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Update:

We already had TPT (Teradata Parallel Transporter) installed and now are working on getting the Teradata Connector working in DataStage. On Friday, I was able to read and write a small volume (a few hundred recs) of data with the large decimals (decimal(20) was the datatype), and there was mass celebration. Good way to end the week....

Job: TDConn -> sample -> dataset

Along comes Monday... we try the same job with a full volume pull out of Teradata - in this case, about 9.5M records - and BOOM. Heap size errors. Here are some of the logs:
Teradata_Connector_60,0: The current soft limit on the data segment (heap) size (1610612736) is less than the hard limit (2147483647), consider increasing the heap size limit
Teradata_Connector_60,0: Current heap size: 1,497,937,768 bytes in 14,076,924 blocks
Teradata_Connector_60,0: Failure during execution of operator logic. [api/operator_rep.C:376]
Teradata_Connector_60,0: Fatal Error: Throwing exception: APT_BadAlloc: Heap allocation failed. [error_handling/exception.C:132]
node_myserver: Player 1 terminated unexpectedly. [processmgr/player.C:157]
Sample_58,4: Failure during execution of operator logic. [api/operator_rep.C:376]
Sample_58,4: Fatal Error: waitForWriteSignal(): Premature EOF on node myserver-sw1 Socket operation on non-socket [iomgr/iocomm.C:1636]
node_myserver-5: Player 1 terminated unexpectedly. [processmgr/player.C:157]
main_program: APT_PMsectionLeader(5, myserver-5), player 1 - Unexpected exit status 1. [processmgr/slprocess.C:363]
APT_PMsectionLeader(1, myserver), player 1 - Unexpected exit status 1. [processmgr/slprocess.C:363]
APT_PMsectionLeader(1, myserver), player 2 - Unexpected exit status 1. [processmgr/slprocess.C:363]
Sample_58,0: Failure during execution of operator logic. [api/operator_rep.C:376]

There is also a warning message that I have not encountered before:
Teradata_Connector_60,0: [IIS-CONN-TERA-005010] The estimated row length of 109 bytes multiplied by the array size of 2,000 is 218,000 which exceeds the maximum buffer size of 64,260
Given that it is referring to exceeding buffers, I am suspicious that this is related to the heap size issues. (By that math, the array size would have to come down to roughly 64,260 / 109 ≈ 589 rows to stay under the buffer limit.)

We updated the SQL to convert the large decimals to strings and the job ran successfully.
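
Something along these lines in the extract SQL (table and column names changed):

-- Cast the DECIMAL(20,0) columns to VARCHAR in the source query so the
-- connector only ever sees string data (names here are made up).
SELECT
    CAST(acct_id   AS VARCHAR(25)) AS acct_id,
    CAST(trans_amt AS VARCHAR(25)) AS trans_amt,
    load_dt
FROM prod_db.big_dec_source;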

I did some searches already on this forum and found several issues related to heap sizes, but have yet to find any related to v8 and TD Connector. Has anyone run into this error with v8 and TD Connector and large decimals?

Brad.
Last edited by bcarlson on Wed Oct 08, 2008 3:02 pm, edited 1 time in total.
It is not that I am addicted to coffee, it's just that I need it to survive.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

By the way, we have updated the ulimits for the userid to unlimited for all categories, and this did not change our results.
myserver:/home/dsadm> ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited
myserver:/home/dsadm> ulimit -aS
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited
myserver:/home/dsadm> ulimit -aH
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Okay, answering part of my question - regarding the warning on the array size and exceeding the buffers:

array size while inserting

We'll give this a try...

Brad
It is not that I am addicted to coffee, it's just that I need it to survive.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Reducing the array size and record count got rid of the warning, but we still failed with the heap size error.

Brad
It is not that I am addicted to coffee, it's just that I need it to survive.
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

What is your job design? Any aggregations or joins?
Teradata Certified Master V2R5
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

P.S. Have you set the ulimits to unlimited for the same user as in the DS environment settings?
Teradata Certified Master V2R5
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Ulimit was set to unlimited across the board. Turns out the issue was the Parallel Synchronization settings. We tried a write mode of both immediate (CLI) and bulk (utility) but had not changed the PS settings at all.

We needed to set Parallel Synchronization to Yes and supply a sync table parameter (give DS a table name to use for synchronization, comparable to the old terasync table used by the TD Enterprise stage).

According to the documentation, without this setting DS runs the connector sequentially. Our guess is that the volume we were pulling from Teradata overran the buffers because we were reading so slowly (in serial, and potentially on one node), which gave us the heap allocation errors. Not sure about that explanation, but it seems to make some sense at least.

We still have some testing and trial and error to go through to figure out the right settings for things, but it looks like our large decimal issue is resolved.

Brad.
It is not that I am addicted to coffee, it's just that I need it to survive.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Hmm, sounds more like a memory leak to me. Running in parallel allowed you to work around the issue since the export is now distributed over more processes, so the memory leak did not exhaust each process's heap. You should still open a case with IBM and have the problem investigated.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

I believe we have a case open, and one of our guys was going to look into a more official explanation from IBM of why it now works. Hopefully, if there is a memory leak issue, it will be identified somewhere in the process.

Thanks!

Brad
It is not that I am addicted to coffee, it's just that I need it to survive.