Read Binary (BCD) file with variable length records

avazeos · Post by **avazeos** » Tue Oct 17, 2006 7:02 am

Hi to all,

I am a newcomer to the forum and I need your help with the following issue(s).

The task we must do using DataStage is to read a binary file (CDRs) that is in BCD format and translate it to ASCII format. I would like someone to describe the best practice for doing this in DataStage.
Another problem with the binary file is that it is not a delimited file and the fields have variable length. Each variable-length field is preceded by a standard field that holds the length of the next field. So I would also like a best-practice approach for this situation.

Thanks in advance.

ray.wurlod · Post by **ray.wurlod** » Tue Oct 17, 2006 8:26 am

Welcome aboard. :D

What you describe is a very unusual scenario - a binary file with other than fixed length records. Do you have any description (such as a COBOL file definition) of the structure? It may be that there is more than one record type - you need to know this before you can proceed.

Once you have the file ("Table") definition you can probably use the Complex Flat File stage to read the file and automatically unpack the binary data.

avazeos · Post by **avazeos** » Wed Oct 18, 2006 12:46 am

Hi,
thanks for your answer.
I have the description of the file format. The file structure is as follows:
At the beginning of each record there are some fixed-length fields that are standard in every record. After those there is a field with the total length of the record, and then there are some variable-length records, each preceded by a one-byte field giving the size of the record that follows.

ray.wurlod · Post by **ray.wurlod** » Wed Oct 18, 2006 12:42 pm

That's fine. If you can represent that as a COBOL format and import from that (put it in a file) then you can readily use the CFF stage. Is every field BCD, or do you have a mix of encodings?

Otherwise you can read the entire record as a VarChar (I've assumed that each line has a line terminator) and parse it in a Transformer stage using substring techniques, having constructed a table definition from the information that you have.

To process these CDRs in DataStage, you will need to convert the BCD numerics into "regular" decimal. I'm in an Internet Cafe at the moment, but am fairly sure there are some SDK routines that can help.
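
For reference, the BCD-to-decimal conversion itself is small. The following is only a sketch in Python, not one of the SDK routines mentioned above. It assumes packed BCD with two digits per byte, high nibble first, and a 0xF filler nibble for odd-length numbers; the function name and the `low_nibble_first` switch are made up for illustration, and the real CDR specification decides which conventions apply.

```python
def bcd_to_digits(data: bytes, low_nibble_first: bool = False, filler: int = 0xF) -> str:
    """Decode packed BCD bytes into an ASCII digit string.

    Assumptions (check against the real file description):
    - two digits per byte, high nibble first unless low_nibble_first is set
    - a nibble equal to `filler` (assumed 0xF) is padding and is skipped
    """
    digits = []
    for byte in data:
        high, low = byte >> 4, byte & 0x0F
        pair = (low, high) if low_nibble_first else (high, low)
        for nibble in pair:
            if nibble == filler:
                continue  # padding nibble, carries no digit
            if nibble > 9:
                raise ValueError(f"invalid BCD nibble 0x{nibble:X}")
            digits.append(str(nibble))
    return "".join(digits)


if __name__ == "__main__":
    # Three bytes holding the digits 1 2 3 4 5 plus one filler nibble.
    print(bcd_to_digits(bytes([0x12, 0x34, 0x5F])))
```

The result is kept as a string rather than an integer so that leading zeros (common in phone numbers) are not lost.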

clshore · Post by **clshore** » Wed Oct 18, 2006 4:44 pm

Are these Call Detail Records?
Depending on the record format, there is often feature encoding at the bit level, as well as BCD encoding of digits.
Not convinced that DataStage is the right tool for this job, but if you insist, then you will probably have to disassemble each record piece by piece, i.e. the record header with lengths first, then each subfield, length byte first, then the number of nibbles or bytes, then the next, etc.
Sounds like a job for a UnivBasic routine; the array processing and bitwise operators will be your friends.

Have fun

Carter
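
The piece-by-piece disassembly described in this thread can be sketched outside DataStage as follows. This is a rough Python illustration only, not a DataStage routine. The thread gives the layout as a fixed header, then a total-length field, then items each preceded by a one-byte size, but it never gives the actual widths. So `HEADER_LEN`, the 2-byte big-endian length format, and the idea that the total length counts the whole record including the header are all assumptions to be replaced with values from the real file description.

```python
import struct
from typing import Iterator, List, Tuple

HEADER_LEN = 4   # hypothetical size of the fixed-length header fields
LEN_FMT = ">H"   # hypothetical: total record length as 2-byte big-endian


def split_records(buf: bytes) -> Iterator[Tuple[bytes, List[bytes]]]:
    """Yield (header, [field, ...]) for each record in buf.

    Assumed layout per record:
      header (HEADER_LEN bytes) | total length | (size byte, payload)*
    where the total length covers the whole record, header included.
    """
    len_size = struct.calcsize(LEN_FMT)
    pos = 0
    while pos < len(buf):
        header = buf[pos:pos + HEADER_LEN]
        (total_len,) = struct.unpack_from(LEN_FMT, buf, pos + HEADER_LEN)
        end = pos + total_len
        if total_len < HEADER_LEN + len_size or end > len(buf):
            raise ValueError(f"bad record length {total_len} at offset {pos}")

        fields = []
        cur = pos + HEADER_LEN + len_size
        while cur < end:
            size = buf[cur]          # one-byte size of the payload that follows
            cur += 1
            if cur + size > end:
                raise ValueError(f"field overruns record at offset {cur}")
            fields.append(buf[cur:cur + size])
            cur += size

        yield header, fields
        pos = end                    # next record starts right after this one


if __name__ == "__main__":
    rec = b"HDR1" + struct.pack(LEN_FMT, 11) + b"\x02\x12\x34" + b"\x01\x5f"
    for header, fields in split_records(rec):
        print(header, fields)
```

Each payload comes back as raw bytes, so BCD decoding or bit-level flag tests can then be applied field by field, depending on what the specification says each one holds.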