the record is too big to fit in a block

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

the record is too big to fit in a block

Post by dodda »

Hello

I have a job that reads a sequential file, treating each record as a single line, breaks each record into multiple columns with a Column Import stage, builds an XML chunk for every record, and finally joins all those chunks to produce one big XML file.
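
Roughly, the logic looks like this (a sketch in Python just to show the shape of the job, not DataStage code; the delimiter, column names and XML element names are made up, only CustomerNumber comes from the real job):

# Illustration only -- not DataStage code; input assumed pipe-delimited.
chunks = []
for line in open("input.txt"):                            # Sequential File stage: one record per line
    cust, amount, pay_date = line.rstrip("\n").split("|")  # Column Import stage: split into columns
    chunk = ("<Payment>"
             "<CustomerNumber>" + cust + "</CustomerNumber>"
             "<Amount>" + amount + "</Amount>"
             "<Date>" + pay_date + "</Date>"
             "</Payment>")                                  # first XML Output stage: one chunk per record
    chunks.append((cust, chunk))

# second XML Output stage: sort on the key and aggregate every chunk
# under a single root -- this is where one output record becomes very large
document = "<Payments>" + "".join(c for _, c in sorted(chunks)) + "</Payments>"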

When I run the job with a small volume of data it works fine, but with more data DataStage aborts. The error from the DataStage log is below:


APT_CombinedOperatorController(14),0: Internal Error: (!(stat.statusBits() & APT_DMStatus::eRecordTooBig)):api/dataset_rep1.C: 1685: Virtual data set.; output of "inserted tsort operator {key={value=CustomerNumber, subArgs={asc, cs}}}": the record is too big to fit in a block;
the length requested is: 142471.
Traceback: msgAssertion__13APT_FatalPathFPCcRC11APT_UStringPCci() at 0xd47ffd70
putRecordToPartition_grow__14APT_DataSetRepFUi() at 0xd6d28a38
putRecord_nonCombined__14APT_DataSetRepFb() at 0xd6d25d94
putRecord__16APT_OutputCursorFv() at 0xd6f1a208
writeOutputRecord__17APT_TSortOperatorFv() at 0xd4c0c46c
runLocally__30APT_CombinedOperatorControllerFv() at 0xd6f49c28
run__15APT_OperatorRepFv() at 0xd6e8720c
runLocally__14APT_OperatorSCFv() at 0xd6e73bbc
runLocally__Q2_6APT_SC8OperatorFUi() at 0xd6efea44
runLocally__Q2_6APT_IR7ProcessFv() at 0xd6f818c8

Is there any setting that I need to add? While building the XML chunks I used LongVarChar as the datatype with the length left empty.

I have gone through the forums, and it was suggested that the APT_DEFAULT_TRANSPORT_BLOCK_SIZE variable needs to be set.

If so, where should I define this environment variable in Administrator? There are Reporting, Operator-specific, Compiler-specific and User Defined environment variables.

In which section should I define it, and what value should I set it to?

Thanks for your help
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It doesn't matter where you set it, though the Administrator client only lets you create environment variables in the User Defined folder.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

Hello Ray,

When I tried to add the APT_DEFAULT_TRANSPORT_BLOCK_SIZE environment variable through Administrator, it said the variable already exists, but when I looked in the list of variables it was not there. Is there a way this variable can be configured?

Thanks
dodda
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Look more carefully.

It's in the Parallel folder (not in any of its sub folders).

The default value is 131072.
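
For reference, 131072 bytes is 128 KB, and your log is asking for 142471 bytes, so - if transport blocks really are the limiting factor - any value of at least 142471 should do; the next power of two, 262144, would be the obvious first thing to try.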
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rcanaran
Premium Member
Posts: 64
Joined: Wed Jun 14, 2006 3:51 pm
Location: CANADA

Post by rcanaran »

I've reviewed the following posts :
http://dsxchange.com/viewtopic.php?p=23680046
viewtopic.php?t=109868
viewtopic.php?t=109896
viewtopic.php?t=126730

The job is creating XML using parallel XML Output stages. The first stage creates individual XML records (chunks), and the stage where it's failing is an XML Output stage that aggregates all rows based on a key.

I've tried setting the following in the JOB parameters of a parallel job (DS 7.5.1, on AIX) :

$APT_MAX_TRANSPORT_BLOCK_SIZE 268435456
$APT_MIN_TRANSPORT_BLOCK_SIZE 268435456
$APT_DEFAULT_TRANSPORT_BLOCK_SIZE 268435456

For the max transport block size, the help text indicates that the maximum is 1,048,576, which I would think would also be the maximum for the default block size. But if I code 300000000 for the default block size, Director issues a WARNING that it is setting the default to 268435456, which is the max. I don't know what the REAL max is for these environment variables. No warning is issued when the value 268435456 is used for all three variables.
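
(For what it's worth, 268435456 bytes is 256 MB, i.e. 256 times the 1,048,576-byte (1 MB) limit the help text quotes, so the documented maximum and the enforced ceiling clearly disagree.)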

But I still get a fatal error :
APT_CombinedOperatorController(1),0: Fatal Error: File data set, file "{0}".; output of "xmoPymntAggr": the record is too big to fit in a block; the length requested is: 500048.

I can take out the MIN and MAX block sizes, but I still get the same error.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It may not be transport blocks you need to change. These are only used for internal transmission of fixed length records. It may be buffer sizes you need to tune. Be very, very careful with these - tuning them through environment variables affects every buffer (link) in the job. It may be better, if possible, to tune buffering per-link.
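
For example, on the link into the failing stage (Input or Output tab > Advanced) you can switch Buffering mode away from the automatic setting and adjust the individual properties - roughly these, with what I believe are the defaults:

Maximum memory buffer size = 3145728 bytes
Buffer free run = 50%
Queue upper bound size = 0 (no limit)
Disk write increment = 1048576 bytes

Treat any specific numbers as a starting point for experimentation, not a recommendation.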
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rcanaran
Premium Member
Posts: 64
Joined: Wed Jun 14, 2006 3:51 pm
Location: CANADA

Post by rcanaran »

Thanks Ray.

In the XML Output stage, on the Output tab under Advanced, I changed buffering from automatic and increased the maximum memory buffer size tenfold, from 3145728 to 31457280. I did the same for the disk write increment, changing it from 1048576 to 10485760. The job dies in exactly the same place, with the same message.

Will try more tuning tomorrow.
rcanaran
Premium Member
Posts: 64
Joined: Wed Jun 14, 2006 3:51 pm
Location: CANADA

Post by rcanaran »

Ran into the low watermark message as per viewtopic.php?t=128506.

Is there any way for me to see what values DS is actually using at run time?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There are a few more reporting/tracing hooks (some enabled via environment variables) than are documented. Your official support provider should be able to guide you.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rcanaran
Premium Member
Posts: 64
Joined: Wed Jun 14, 2006 3:51 pm
Location: CANADA

Post by rcanaran »

No parameter setting seemed to work. I was building detail XML chunks in one XML Output stage and aggregating in the next (with a Copy stage in between). I've done this before to control the way the XML is formed and haven't encountered this problem until now.

I'm still verifying that the generated XML conforms to the schema (it requires several more chunks to be built first), but it appears that I can work around this by aggregating in the first stage. I also set everything back to default: all block size environment variables were deleted and the link buffer parameters were reset.

I don't know if this applies to the original poster, but so far, this works for me.
rcanaran
Premium Member
Posts: 64
Joined: Wed Jun 14, 2006 3:51 pm
Location: CANADA

Post by rcanaran »

The problem returns when I try to send the output to another stage or to a Data Set stage. Sending the output to a Sequential File stage is what solved the problem. What APPEARED to be the solution in the previous post was coincidental, as that output was written to a sequential file. As soon as I changed it back to a data set, the problem reappeared.

So far, adjusting the max/min and default transport block sizes didn't seem to help. Neither did adjusting the buffer parameters on the link. I even tried using NO BUFFER on the link, but this didn't help.
gpatton
Premium Member
Posts: 47
Joined: Mon Jan 05, 2004 8:21 am

Post by gpatton »

You cannot write records to a data set if they are longer than the data set's block size, which is 128 KB by default. You can change that, though, by setting APT_PHYSICAL_DATASET_BLOCK_SIZE.
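
For example, the failing write above asked for 500048 bytes against the default 131072-byte (128 KB) block, so adding the variable as a job parameter - the same way the transport block variables were added earlier - with something like

$APT_PHYSICAL_DATASET_BLOCK_SIZE 1048576

(1 MB, comfortably larger than the biggest record seen so far) ought to be enough; the exact figure is only a suggestion.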
rcanaran
Premium Member
Posts: 64
Joined: Wed Jun 14, 2006 3:51 pm
Location: CANADA

Post by rcanaran »

Thanks. That seemed to work.