
Complex Flat File stage - variable record lengths

Posted: Tue Sep 06, 2011 7:30 pm
by djbarham
I'm having a problem with variable record lengths on the CFF stage.

I think my problem is the same as in this thread, but I don't think the respondents there really understood the problem:

viewtopic.php?p=391304

I have several different record types and the metadata defined for each record type. All records end with a Unix newline.

The problem is that some of the fields at the end of some records are optional and, as a result, some input records end before the end of the definition. These records are rejected with an input buffer overrun error.

It doesn't seem to matter how I define the fields that are missing from the input record (nullable or with a default); I still get the error.

Whether or not the optional fields exist does not depend on the record type.

For example, say I have record types A, B and C.

Record types A and B are always fully populated and work fine. Records of type C are rejected if they are shorter than the metadata.

If I remove the extra fields from the definition of record type C, then these records no longer give an error. As soon as I define even a single field that goes past the physical end of the shortest record, the shorter records are rejected. The longer type C records still come through.

Is there a solution to this or is this a limitation of the CFF stage?

Re: Complex Flat File stage - variable record lengths

Posted: Tue Sep 06, 2011 8:35 pm
by djbarham
Pending a resolution of this with the CFF stage, I think I'll build another job to preprocess the file by padding records out to their respective maximum lengths.
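
In case it helps anyone else, the padding job boils down to something like the Python sketch below (purely illustrative, outside DataStage; the record type is assumed to be the first character of each line and the per-type maximum lengths are hypothetical placeholders):

Code:
# Hypothetical per-record-type maximum lengths; adjust to the real layout.
MAX_LEN = {"A": 120, "B": 150, "C": 200}

with open("input.dat") as src, open("padded.dat", "w") as dst:
    for line in src:
        rec = line.rstrip("\n")
        rec_type = rec[:1]                        # record type assumed to be the first character
        target = MAX_LEN.get(rec_type, len(rec))  # unknown types pass through unchanged
        dst.write(rec.ljust(target) + "\n")       # pad short records with spaces to the type's maximum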

Posted: Tue Sep 06, 2011 10:34 pm
by jwiles
CFF does not readily support a single record type having varying numbers of columns, outside of the support of the "OCCURS DEPENDING ON" COBOL FD clause.

The solution you mention (pre-processing the file) will likely work by creating your missing columns for you. Another possibility, should your file format support it, is to define the 'C' record type as a single variable-length column in CFF then use a transformer to parse out the columns which are present while providing default values for the missing columns.

Regards,

Posted: Wed Sep 07, 2011 12:10 am
by djbarham
jwiles wrote: CFF does not readily support a single record type having varying numbers of columns, outside of the support of the "OCCURS DEPENDING ON" COBOL FD clause.
Hmmm ... I wouldn't call it a varying number of columns so much as optional columns. ;)

If it hits the end of the record before the end of the definition, I just want it to default the remaining columns. Enhancement request, maybe. :)
jwiles wrote: The solution you mention (pre-processing the file) will likely work by creating your missing columns for you.
Yep, already built this and it works fine. It just pads out every record to a predefined length based on the record type.
jwiles wrote: Another possibility, should your file format support it, is to define the 'C' record type as a single variable-length column in CFF then use a transformer to parse out the columns which are present while providing default values for the missing columns.
Yeah, ah ... no, that is what I was trying to avoid.

Thanks, you have confirmed what I was beginning to suspect - that the CFF stage cannot handle records shorter than the record definition (it does not seem to mind if they are longer).

Posted: Wed Sep 07, 2011 8:55 pm
by jwiles
Varying columns/optional columns: Means the same thing :)

A variation of the second solution would be to make the last always-present column a varchar and use a transformer to parse it and your "optional" columns (or a transformer to pad and a column importer to parse). I generally prefer this solution because it's relatively simple and doesn't require an additional pass over the data file (and hence additional storage for the modified file). That can make a difference as data volumes increase.
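
For what it's worth, the parse-and-default logic amounts to something like this sketch in Python (purely illustrative, outside DataStage; the column names, widths and default values are hypothetical):

Code:
# Hypothetical optional columns: (name, width, default value).
OPTIONAL_COLS = [("OPT_CODE", 4, "    "), ("OPT_QTY", 6, "000000"), ("OPT_REF", 10, "")]

def parse_tail(tail):
    """Slice fixed-width optional columns out of the trailing varchar,
    defaulting any column the record is too short to contain."""
    values, pos = {}, 0
    for name, width, default in OPTIONAL_COLS:
        chunk = tail[pos:pos + width]
        values[name] = chunk.ljust(width) if chunk else default
        pos += width
    return values

print(parse_tail("AB12000042"))   # OPT_REF is absent, so it comes back defaulted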

Regards,

Posted: Mon Dec 05, 2011 10:36 am
by PhilHibbs
The way we have got around this is by setting the FTP transfer mode to "linemode=rdw" on the mainframe that sends us the file, and defining two 2-byte big-endian fields, "RDW_LEN" and "RDW_FILLER", at the beginning of the record. Then use RDW_LEN in the Records ID tab to identify what kind of record it is based on its length. Bad luck if you have two different record types of the same length...
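
For anyone curious what reading that RDW prefix looks like outside DataStage, here's a rough Python sketch. It assumes the conventional layout: a 2-byte big-endian length followed by 2 filler bytes, with the length including the 4-byte RDW itself (the file name is just a placeholder):

Code:
import struct

def read_rdw_records(path):
    """Yield records from a file whose records carry a 4-byte RDW prefix."""
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break                              # end of file
            rec_len, _filler = struct.unpack(">HH", header)
            # The RDW length conventionally includes the 4-byte RDW itself.
            yield f.read(rec_len - 4)

# Example usage with a placeholder file name:
for rec in read_rdw_records("mainframe_extract.dat"):
    print(len(rec), rec[:20])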