How to process Variable Length File

rohit.agarwalin · Post by **rohit.agarwalin** » Mon Jul 15, 2013 5:49 am

Hi,

I need to read a variable-length file (ASCII, with packed decimals in each record type) that has no record delimiter.
The file contains several record types, each with a different length. Each record starts with a 2-byte field (RDW) holding the record length, followed by the record itself at that length, plus some padding at the end so that the total length is a multiple of 4.

e.g. [RDW (2-byte field) = 12][12-byte record][2 bytes of padding]
The record length including the RDW is 14 bytes, so 2 bytes of padding are added to make the total a multiple of 4.

Could anyone suggest how to read this file? Can I read it using the CFF stage, or do I need to write a program to read it?

I have gone through various topics in this forum but did not find the answer I am looking for. If I missed a post that gives a solution, please point me to the link.

Thanks,
Rohit

chulett · Post by **chulett** » Mon Jul 15, 2013 7:15 am

Well... the variable length record part with packed decimals and a record length field at the front would work in the CFF stage - up until the 'padded to a multiple of 4' part, which throws a damper on things. I'm thinking you'd have to go 'BuildOp' (or some other custom solution) here but curious what others think.
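For illustration only, here is a rough sketch of what a custom reader could look like outside DataStage, in Python rather than as a BuildOp. It assumes the RDW is a big-endian unsigned 2-byte integer whose value excludes the RDW itself (as in the example in the first post) and that the padding bytes carry no data. The function name `read_records` is made up for this sketch.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple

RDW_SIZE = 2   # bytes in the length prefix, per the layout in the first post
ALIGNMENT = 4  # RDW + record + padding is a multiple of this


def read_records(stream: BinaryIO) -> Iterator[Tuple[int, bytes]]:
    """Yield (rdw_value, record_bytes) pairs from an RDW-prefixed stream."""
    while True:
        header = stream.read(RDW_SIZE)
        if not header:
            return  # clean end of file
        if len(header) < RDW_SIZE:
            raise ValueError("truncated RDW at end of file")
        # Assumption: big-endian unsigned length that excludes the RDW itself.
        (length,) = struct.unpack(">H", header)
        record = stream.read(length)
        if len(record) < length:
            raise ValueError("truncated record")
        # Skip the filler that rounds RDW + record up to a multiple of 4.
        padding = (-(RDW_SIZE + length)) % ALIGNMENT
        stream.read(padding)
        yield length, record


# The example from the first post: RDW=12, 12-byte record, 2 bytes of padding.
sample = struct.pack(">H", 12) + b"ABCDEFGHIJKL" + b"\x00\x00"
records = list(read_records(io.BytesIO(sample)))
```

A real file would need to be opened in binary mode (`open(path, "rb")`), since the packed fields are not printable text.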

How will you recognize each of the 'record types' in the file, do they each have a unique record length? Or some record type identifier field?

rohit.agarwalin · Post by **rohit.agarwalin** » Mon Jul 15, 2013 8:53 am

The record type is identified by the value in the RDW field. Each record type has a different length, and therefore a different RDW value.
There is no record delimiter.

I have a workaround that works in the CFF stage with a pure text file. Next I will run the same test with the hex and packed values. In the real file, the RDW holds the record length in hex.

For each record type we know how many bytes of padding there are, so I will declare the padding as part of the record layout.
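As a quick check on the filler size to declare per layout, the padding can be derived from the RDW value. This is only a sketch, assuming the RDW value excludes the 2-byte RDW itself as in the earlier example. The lengths 13, 14 and 15 are hypothetical; only 12 comes from this thread.

```python
RDW_SIZE = 2   # bytes in the length prefix
ALIGNMENT = 4  # total record size must be a multiple of this


def padding_for(record_length: int) -> int:
    """Filler bytes needed so RDW + record + padding is a multiple of 4."""
    return (-(RDW_SIZE + record_length)) % ALIGNMENT


# Hypothetical record lengths; only 12 appears in the thread.
for record_length in (12, 13, 14, 15):
    pad = padding_for(record_length)
    total = RDW_SIZE + record_length + pad
    print(f"RDW={record_length:>3}  padding={pad}  total={total}")
```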

I will let you know once the next test succeeds.

Thanks for your help. Please let me know if you have a better way to read the file; if my test does not work, I will be back with the same question.

ArndW · Post by **ArndW** » Mon Jul 15, 2013 11:15 am

You are on the right track, as the CFF stage allows multiple record types, each of which can contain a different number of columns and have a different length.

chulett · Post by **chulett** » Mon Jul 15, 2013 11:30 am

However, the length noted in the field is *not* the actual length as it is then "padded to a multiple of 4" if needed... that seems like an issue to me.

ArndW · Post by **ArndW** » Mon Jul 15, 2013 12:14 pm

The column "RDW" is used to determine the type of record, then depending on that a given record layout can be used/defined in the CFF stage - and that will be some number of packed fields.
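For anyone checking the packed fields outside the CFF stage, a minimal decoder sketch follows. It assumes the usual packed-decimal (COMP-3) layout: two digits per byte, with the low nibble of the last byte holding the sign (B or D negative, A, C, E or F positive). The function name `unpack_comp3` and the `scale` parameter (number of implied decimal places) are illustrative, not taken from the thread.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal (COMP-3) field into a Decimal."""
    if not raw:
        raise ValueError("empty packed-decimal field")
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)     # high nibble is one digit
        digits.append(byte & 0x0F)   # low nibble is the next digit
    digits.append(raw[-1] >> 4)      # last byte: one digit...
    sign_nibble = raw[-1] & 0x0F     # ...followed by the sign nibble
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble")
    if sign_nibble < 0x0A:
        raise ValueError("invalid sign nibble")
    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):  # B and D mark negative values
        value = -value
    # Shift the implied decimal point left by `scale` places.
    return Decimal(value).scaleb(-scale)


amount = unpack_comp3(bytes.fromhex("12345C"), scale=2)
```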