
error in px

Posted: Wed Sep 13, 2006 10:22 am
by samsuf2002
Hi
I am running a parallel job that is aborting with this fatal error: "Consumed more than 100000 bytes looking for record delimiter; aborting". I am using SQL Server, and my job contains Change Capture and Lookup stages.

Please help
sam

Posted: Wed Sep 13, 2006 10:36 am
by samsuf2002
The delimiter I am using is pipe (|).

Posted: Wed Sep 13, 2006 2:37 pm
by ray.wurlod
Are you sure that you've specified the delimiter character correctly in the job design? (Remember that there are three delimiters: field delimiter, final delimiter and record delimiter.)

The message suggests that the job has read 100000 bytes without even finding one delimiter character. I'd give up too! Do you have extremely large VarChar data types?
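One quick sanity check, outside DataStage, is to scan the raw file and see how far apart the delimiters really are. A rough Python sketch (the file name and delimiter choices below are placeholders, adjust them for your own file):

Code:

# Diagnostic sketch only (not DataStage code): measure how far the reader
# would have to scan before hitting the delimiters it was told to expect.
RECORD_DELIM = b"\n"          # assumed record delimiter
FIELD_DELIM = b"|"            # assumed field delimiter
PATH = "source_file.txt"      # hypothetical file name

with open(PATH, "rb") as f:
    data = f.read()

records = data.split(RECORD_DELIM)
longest_record = max(len(r) for r in records)
no_field_delim = sum(1 for r in records if r and FIELD_DELIM not in r)

print(f"records: {len(records)}")
print(f"longest record (bytes): {longest_record}")
print(f"records with no field delimiter: {no_field_delim}")

If "longest record" comes back near or above 100000, the reader really is scanning that far without ever seeing the delimiter it expects.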

Posted: Mon Nov 20, 2006 1:57 pm
by rwierdsm
Is anyone aware of a setting that will bump up the 100000 bytes to a higher amount?

I need to import records that may have several 32k fields, possibly going over the 100000 byte limit.

Yikes!

Posted: Mon Nov 20, 2006 2:25 pm
by ray.wurlod
If you get the delimiter right it shouldn't matter, unless one field itself will exceed that limit. Double-eek!

Posted: Mon Nov 20, 2006 2:47 pm
by rwierdsm
I'm reading in the whole record as one string and breaking it up into many fields later on. It is entirely possible that one field (record) will be longer than 100000 bytes.

I've found an environment variable that is supposed to allow a bigger limit:

APT_MAX_DELIMITED_READ_SIZE

I'll let you know how it works.

Rob W

Posted: Mon Nov 20, 2006 3:55 pm
by ray.wurlod
Reading one field will not be searching for a field delimiter, so you should be fine.

Posted: Tue Nov 21, 2006 9:38 am
by rwierdsm
We set APT_MAX_DELIMITED_READ_SIZE to 150000 from its default of 100000 and the job read in our single file.

Posted: Tue Nov 21, 2006 12:20 pm
by ray.wurlod
Thanks for letting us know.

My reaction (yuk) is unchanged.

Posted: Fri Nov 24, 2006 9:05 am
by rwierdsm
ray.wurlod wrote: Thanks for letting us know.

My reaction (yuk) is unchanged.
Can't disagree, but it's what I'm stuck with!

Rob W.

Posted: Fri Apr 13, 2007 8:52 am
by rwierdsm
Further to this thread....

We are now approaching the end of our testing phase and have needed to revisit the values for the APT environment variables mentioned above, so I'd like to document our findings.

To briefly summarize, we have a file that we read in as a single wide column, splitting into component columns later in processing. As some of these component columns can be quite wide, up to 32k, the single wide column approaches 200k bytes in length.
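Conceptually, the "one wide column, split later" approach amounts to something like the following (a rough Python sketch only, not the actual job logic; the delimiter and column count are made up):

Code:

# Illustrative sketch: carry each ~200k-byte record as one wide string,
# then split it into its component columns downstream.
WIDE_FIELD_DELIM = "|"    # assumed delimiter inside the wide column
EXPECTED_COLUMNS = 6      # hypothetical number of component columns

def split_wide_record(wide_record: str) -> list[str]:
    cols = wide_record.split(WIDE_FIELD_DELIM)
    if len(cols) != EXPECTED_COLUMNS:
        raise ValueError(f"expected {EXPECTED_COLUMNS} columns, got {len(cols)}")
    return cols

# A worst-case record: six 32k columns is already around 192k bytes.
sample = WIDE_FIELD_DELIM.join("x" * 32000 for _ in range(EXPECTED_COLUMNS))
print(len(sample))                     # ~192005 bytes, well past the 100000-byte default
print(len(split_wide_record(sample)))  # 6 component columns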

To accommodate this file, we are reading it with a Sequential File stage. The DS job needed to have two environment variables set:
  - APT_DEFAULT_TRANSPORT_BLOCK_SIZE
  - APT_MAX_DELIMITED_READ_SIZE
During our latest tests, the source application sent us a file where all fields are filled to their maximum size. Our initial setting of 200000 for APT_MAX_DELIMITED_READ_SIZE was insufficient; we received the following log message:

Code:

Input_File,3: Consumed more than 100000 bytes looking for record delimiter; aborting
We don't know why the log message referred to 100000 bytes when our setting was 200000; this actually caused us to suspect that DS was ignoring the setting. However, we continued to play with the values.

After many runs we found that we needed to increase this setting to 550000 (much bigger than our actual row size - we don't know why!).

When the message above was cleared, we found we were getting the following log entry:

Code:

Input_File,3: Fatal Error: Virtual dataset; output of "Input_File": Record too big to fit in a block; length requested: 133173
This prompted us to add an entry for APT_DEFAULT_TRANSPORT_BLOCK_SIZE and set it to 262144 (256k).
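For what it's worth, the arithmetic behind that choice appears to be: the failing record wanted 133173 bytes, which is just over 128k, and 262144 is the next power of two above it. A small sketch (treating 131072 as the default block size is our assumption):

Code:

# Rough sizing sketch, based on assumptions rather than documented behaviour:
# choose the next power of two at or above the largest record we expect.
ASSUMED_DEFAULT_BLOCK = 131072   # 128k - assumed default transport block size
FAILED_RECORD_LENGTH = 133173    # from the "Record too big to fit in a block" message

def next_power_of_two(n: int) -> int:
    size = 1
    while size < n:
        size *= 2
    return size

print(FAILED_RECORD_LENGTH > ASSUMED_DEFAULT_BLOCK)   # True: the record no longer fits
print(next_power_of_two(FAILED_RECORD_LENGTH))        # 262144 (256k), the value that worked for us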

This seems to have made DataStage quite happy.

Rob W.

Posted: Fri Apr 13, 2007 2:25 pm
by sud
Very recently I came across the same issue regarding dataset record size and increased APT_DEFAULT_TRANSPORT_BLOCK_SIZE to resolve it. What I found was that even when the test data doesn't contain records that long (I had specified very wide VarChars in anticipation of the actual data), DataStage allocates the full declared length in memory. So a VarChar is not really a VarChar, since internally it is all C data types. Even worse, the job execution time varies drastically, because with longer data lengths the job runs into buffering, which slows things down.
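As a back-of-the-envelope illustration of that cost (this is illustrative arithmetic only, not actual DataStage internals; the prefix size and column widths are assumptions):

Code:

# Illustration only: if a bounded VarChar is carried at its full declared
# length (plus an assumed small length prefix), over-declared columns inflate
# every record regardless of the data actually in them.
ASSUMED_PREFIX_BYTES = 4                         # assumption, for illustration
declared_max_lengths = [32000, 32000, 255, 50]   # hypothetical column definitions
typical_actual_lengths = [120, 80, 30, 10]       # what the test data really holds

allocated = sum(n + ASSUMED_PREFIX_BYTES for n in declared_max_lengths)
used = sum(typical_actual_lengths)

print(f"bytes carried per record: {allocated}")   # roughly 64k
print(f"bytes of real data:       {used}")        # a few hundred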

Posted: Fri Apr 13, 2007 4:09 pm
by ray.wurlod
Transport blocks apply only for fixed-length records and only between player processes. I guess by the time you're in a virtual Data Set you effectively do have fixed-length records, because bounded VarChar is stored as prefix bytes and full allocation of character space.