Flat File with Header, Body And Footer

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

derekchu
Participant
Posts: 23
Joined: Sun Sep 24, 2006 10:09 pm


Post by derekchu »

Dear all,

One of my input files is logically divided into three parts: header, body, and footer. The formats are as follows:

Header:
REC_TYPE:string[1]
HDR1:string[10]
HDR2:string[10]
...
Body:
REC_TYPE:string[1]
COL1:decimal[10]
COL2:string[5]
...
Footer:
REC_TYPE:string[1]
FTR1:string[10]
FTR2:string[10]
...

To distinguish which part of the file a record belongs to, we check the first column (REC_TYPE): '0' means header, '1' means body, and '2' means footer.
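For illustration, here is a tiny ASCII mock-up of such a file; every column value below is invented, and the real file may of course be EBCDIC or have different widths:

```shell
# Create a hypothetical sample: one header record ('0'), two body
# records ('1'), and one footer record ('2'). All values are made up.
cat > sample.txt <<'EOF'
0HEADER0001HEADER0002
10000012345ABCDE
10000067890FGHIJ
2FOOTER0001FOOTER0002
EOF
```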

Is it possible to use the Complex Flat File stage to load the file into DataStage?

Regards,
Derek CHU
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

What type of file is this?

Is it a normal file that can be viewed with the cat command or the vi editor, or is it generated by a mainframe system?

The CFF stage is typically used when the file is generated by a mainframe (MF) system.
Regards,
S. Kirtikumar.
derekchu
Participant
Posts: 23
Joined: Sun Sep 24, 2006 10:09 pm

Post by derekchu »

Yes, the file comes from a mainframe.

What I want to do is to load the header, body and footer into different datasets.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Kirtikumar wrote: Is it a normal file that can be viewed with the cat command or the vi editor, or is it generated by a mainframe system?
Mainframe systems produce 'normal' files, too. :wink:

Are you trying to ask whether it is EBCDIC or has packed fields in it? That would help make the case for using the CFF stage, but its use certainly isn't limited to 'mainframe' files... more like, well, 'complex' files regardless of source.

I haven't had the pleasure, so can't help with the gory details but the CFF stage should be able to handle your file. Have you gone through the Help that's available for it?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

chulett wrote:Mainframe systems produce 'normal' files, too. :wink:
Are you trying to ask if it is EBCDIC or has packed fields in it?
Yes, I meant an EBCDIC file.

I am aware that a mainframe can generate a normal file as well, so I should have said EBCDIC earlier.
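As an aside, if the file does arrive in EBCDIC and you just want to inspect the character data outside DataStage, the standard dd utility can translate it. A minimal sketch (the file names are placeholders, and the sample input is generated here only for demonstration):

```shell
# Make a small EBCDIC sample from ASCII text (demo only), then
# translate it back with dd's built-in EBCDIC-to-ASCII table.
printf '0HEADER0001' > ascii_in.dat
dd if=ascii_in.dat of=ebcdic.dat conv=ebcdic 2>/dev/null   # ASCII -> EBCDIC
dd if=ebcdic.dat of=ascii_out.dat conv=ascii 2>/dev/null   # EBCDIC -> ASCII
```

Note this handles only the character translation; packed-decimal (COMP-3) fields still need a proper tool such as the CFF stage.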

If the file coming from the mainframe is a normal one, then reading it with two columns - one for the record type and one holding the rest of the line as a single column - should work. You can then use a Filter stage to separate out the header, detail, and trailer records. Once this is done, you can split the remaining column into its individual fields.

I have never tried this approach on a complex file, but it should work.
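Just to make the idea concrete, here is the same record-type split sketched outside DataStage, assuming a plain ASCII file; input.txt and the output file names are illustrative, and the sample data is invented:

```shell
# One-pass split on the first character of each record:
# 0 -> header, 1 -> detail (body), 2 -> trailer.
printf '0HDR\n1AAA\n1BBB\n2FTR\n' > input.txt   # tiny made-up sample
awk '{
    t = substr($0, 1, 1)
    if      (t == "0") print > "header.txt"
    else if (t == "1") print > "detail.txt"
    else if (t == "2") print > "trailer.txt"
}' input.txt
```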
Regards,
S. Kirtikumar.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yup, you can always take that approach as long as you don't mind parsing out all of the individual columns for each piece yourself. Pretty sure the CFF stage will automate all that.
-craig

"You can never have too many knives" -- Logan Nine Fingers
talk2shaanc
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

If it's an ASCII flat file (referred to as a 'normal' file in previous posts), there are several approaches:

1. Split the file upfront in a script (shell, Perl, or CoSort) and do the remaining logic in the DataStage job.

2. If you don't need the header and trailer records in your DS job, use an External Source stage and define the layout of your detail (body) record in the stage's metadata. In the Source Program property you can have:

Code:

grep '^1' <file name>
People have the perception that this takes time, but in my experience with 24 million records it didn't take much time - it was pretty much the same as a Sequential File stage.

3. Read only two columns, rec_type and all_fields. Use a Filter stage to split the records, write them out to files, and then use the three files in the next job.

4. The ugly and dirty way: use a Sequential File stage with the layout of the detail record and Reject Mode = Output. This will write your header and trailer to the reject file while the remaining records move forward. A caveat: any detail (body) records with invalid or junk values in some of the fields will also be written to the reject file. You can also skip the header by setting "First Line Is Column Name".
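For option 1, a minimal shell sketch of the upfront split; the file names are placeholders, the sample data is a made-up stand-in for the real source, and this reads the file once per record type:

```shell
# Split source.dat by record type before the DataStage job runs.
printf '0H\n1A\n1B\n2F\n' > source.dat   # stand-in for the real file
grep '^0' source.dat > header.dat    # header records
grep '^1' source.dat > body.dat      # detail records
grep '^2' source.dat > footer.dat    # trailer records
```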
Shantanu Choudhary
derekchu
Participant
Posts: 23
Joined: Sun Sep 24, 2006 10:09 pm

Post by derekchu »

Thanks, Shantanu, for your reply.

We would like to avoid using any "split" approach (to avoid unnecessary file landing).

Is there any way to handle this kind of source file in DataStage so that it can be loaded directly?

Thanks,

Derek Chu
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You mean other than the approaches you've already been given in the other posts in this thread? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
talk2shaanc
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

derekchu wrote: Thanks, Shantanu, for your reply.

We would like to avoid using any "split" approach (to avoid unnecessary file landing).

Is there any way to handle this kind of source file in DataStage so that it can be loaded directly?

Thanks,

Derek Chu
If you want to avoid landing files, option 2 is the best one you have.
Shantanu Choudhary