Error in PX

Post questions here related to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennessee

Error in PX

Post by samsuf2002 »

Hi,

I am running a parallel job and it is aborting with this fatal error: "Consumed more than 100000 bytes looking for record delimiter; aborting". The source is SQL Server, and the job contains a Change Capture stage and a Lookup.

Please help,
sam
hi sam here
samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennessee

Post by samsuf2002 »

The delimiter I am using is the pipe character (|).
hi sam here
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you sure that you've specified the delimiter character correctly in the job design? (Remember that there are three delimiters: field delimiter, final delimiter and record delimiter.)

The message suggests that the job has read 100000 bytes without even finding one delimiter character. I'd give up too! Do you have extremely large VarChar data types?
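
One way to check this, outside DataStage, is to scan the start of the source file and see how far it goes before the record delimiter you believe is in place actually appears. A minimal Python sketch (the file path and delimiter are placeholders, not taken from the job):

Code: Select all

# How far into the file is the first record delimiter?
# If it never appears within the limit, the import operator gives up
# with exactly the "Consumed more than 100000 bytes" message above.
def bytes_to_first_delimiter(path, record_delim=b"\n", limit=100000):
    with open(path, "rb") as f:
        chunk = f.read(limit + 1)
    pos = chunk.find(record_delim)
    if pos < 0:
        print(f"No record delimiter within the first {limit} bytes")
    else:
        print(f"First record delimiter at byte {pos}")
    return pos

bytes_to_first_delimiter("/tmp/source_extract.dat", record_delim=b"\n")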
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rwierdsm
Premium Member
Posts: 209
Joined: Fri Jan 09, 2004 1:14 pm
Location: Toronto, Canada

Post by rwierdsm »

Is anyone aware of a setting that will raise the 100000-byte limit?

I need to import records that may have several 32k fields, possibly going over the 100000 byte limit.

Yikes :!:
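
For a sense of scale, a handful of 32k fields is already past the default; rough arithmetic only (the field count here is illustrative):

Code: Select all

# Four 32 KB fields plus the pipe delimiters between them already
# exceed the default 100000-byte limit.
fields = 4
max_field_bytes = 32 * 1024              # 32768
worst_case = fields * max_field_bytes + (fields - 1)
print(worst_case, worst_case > 100000)   # 131075 True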
Rob Wierdsma
Toronto, Canada
bartonbishop.com
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

If you get the delimiter right it shouldn't matter, unless one field itself will exceed that limit. Double-eek!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rwierdsm
Premium Member
Posts: 209
Joined: Fri Jan 09, 2004 1:14 pm
Location: Toronto, Canada

Post by rwierdsm »

I'm reading in the whole record as one string and breaking it up into many fields later on. It is entirely possible that one field (i.e. the whole record) will be longer than 100000 bytes.

I've found an environment variable that is supposed to allow a larger limit:

APT_MAX_DELIMITED_READ_SIZE

I'll let you know how it works.
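
One way to pick a value with some headroom, rather than guessing, is to measure the longest record in the file first. A small Python sketch (the path is a placeholder and a newline record delimiter is assumed):

Code: Select all

# Longest record in the source file, in bytes, so the new
# APT_MAX_DELIMITED_READ_SIZE value can be set comfortably above it.
def longest_record_bytes(path):
    longest = 0
    with open(path, "rb") as f:
        for record in f:                  # assumes newline-delimited records
            longest = max(longest, len(record.rstrip(b"\r\n")))
    return longest

print(longest_record_bytes("/tmp/source_extract.dat"))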

Rob W
Rob Wierdsma
Toronto, Canada
bartonbishop.com
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Reading only one field means it won't be searching for a field delimiter, so you should be fine.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rwierdsm
Premium Member
Posts: 209
Joined: Fri Jan 09, 2004 1:14 pm
Location: Toronto, Canada

Post by rwierdsm »

We set APT_MAX_DELIMITED_READ_SIZE to 150000 from its default of 100000 and the job read in our single-column file.
Rob Wierdsma
Toronto, Canada
bartonbishop.com
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Thanks for letting us know.

My reaction (yuk) is unchanged.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rwierdsm
Premium Member
Posts: 209
Joined: Fri Jan 09, 2004 1:14 pm
Location: Toronto, Canada

Post by rwierdsm »

ray.wurlod wrote:Thanks for letting us know.

My reaction (yuk) is unchanged.
Can't disagree, but it's what I'm stuck with!

Rob W.
Rob Wierdsma
Toronto, Canada
bartonbishop.com
rwierdsm
Premium Member
Posts: 209
Joined: Fri Jan 09, 2004 1:14 pm
Location: Toronto, Canada

Post by rwierdsm »

Further to this thread....

We are now approaching the end of our testing phase and have needed to revisit the values for the APT environment variables mentioned above, so I'd like to document our findings.

To briefly summarize, we have a file that we read in as a single wide column, splitting it into component columns later in processing. As some of these component columns can be quite wide, up to 32k, the single wide column approaches 200k bytes in length.

To accommodate this file, we are reading it with a Sequential File stage. The DS job needed to have two environment variables set:
- APT_DEFAULT_TRANSPORT_BLOCK_SIZE
- APT_MAX_DELIMITED_READ_SIZE
During our latest tests, the source application sent us a file where all fields are filled to the maximum size. Our initial setting for APT_MAX_DELIMITED_READ_SIZE of 200000 was insufficient; we received the following log message:

Code: Select all

Input_File,3: Consumed more than 100000 bytes looking for record delimiter; aborting
We don't know why the log message still referred to 100000 bytes when our setting was 200000; this actually caused us to suspect that DS was ignoring the setting. However, we continued to experiment with the values.

After many runs we found that we needed to increase this value to 550000 (much bigger than our actual row size; we don't know why!).

When the message above was cleared, we found we were getting the following log entry:

Code: Select all

Input_File,3: Fatal Error: Virtual dataset; output of "Input_File": Record too big to fit in a block; length requested: 133173
This prompted us to add an entry for APT_DEFAULT_TRANSPORT_BLOCK_SIZE and set it to 262144 (256k).
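
The arithmetic behind that choice: the failing record needed 133173 bytes, and 262144 is the next power of two above it. (Whether the block size strictly has to be a power of two is an assumption here; it is simply the value that worked for us.) A small check:

Code: Select all

# Round the largest observed record length up to the next power of two.
def next_power_of_two(n):
    p = 1
    while p < n:
        p *= 2
    return p

needed = 133173                      # "length requested" from the log above
print(next_power_of_two(needed))     # 262144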

This seems to have made DataStage quite happy.

Rob W.
Rob Wierdsma
Toronto, Canada
bartonbishop.com
sud
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Post by sud »

Very recently I came across the same issue regarding dataset record size and increased APT_DEFAULT_TRANSPORT_BLOCK_SIZE to resolve it. What I found was that even when the test data didn't contain records that long (in anticipation of the actual data I had specified VarChars with large lengths), DataStage allocates the full declared length in memory. So a VarChar is not actually a VarChar, since internally it is all C data types. Even worse, the job execution time varies drastically, because with longer data lengths the job runs into buffering, which slows things down.
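
A rough illustration of that point, assuming a prefix-plus-full-allocation layout for bounded VarChars (the column widths and 4-byte prefix below are illustrative assumptions, not measured DataStage internals):

Code: Select all

# With bounded VarChars carried as a length prefix plus the full declared
# width, the carried record is as wide as the declared maximums,
# regardless of how short the actual data is.
declared_varchar_widths = [32000, 32000, 255, 50]   # hypothetical schema
prefix_bytes = 4

carried = sum(w + prefix_bytes for w in declared_varchar_widths)
typical_actual = sum([120, 300, 40, 10])            # typical real lengths

print(f"Carried per record:  {carried} bytes")         # 64321
print(f"Actual data content: {typical_actual} bytes")  # 470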
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Transport blocks apply only to fixed-length records, and only between player processes. I guess by the time you're in a virtual Data Set you effectively do have fixed-length records, because a bounded VarChar is stored as prefix bytes plus a full allocation of character space.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.