Flat File with Header, Body And Footer

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

derekchu
Participant
Posts: 23
Joined: Sun Sep 24, 2006 10:09 pm


Post by derekchu »

Dear all,

One of my input files is logically divided into three parts: header, body, and footer. The formats are as follows:

Header:
REC_TYPE:string[1]
HDR1:string[10]
HDR2:string[10]
...
Body:
REC_TYPE:string[1]
COL1:decimal[10]
COL2:string[5]
...
Footer:
REC_TYPE:string[1]
FTR1:string[10]
FTR2:string[10]
...

To distinguish which part of the file a record belongs to, we check the first column (REC_TYPE): '0' means header, '1' means body, and '2' means footer.
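For illustration, here is a tiny ASCII mock-up of such a file; every column value below is invented, and the real file may of course be EBCDIC or have different widths:

```shell
# Create a hypothetical sample: one header record ('0'), two body
# records ('1'), and one footer record ('2'). All values are made up.
cat > sample.txt <<'EOF'
0HEADER0001HEADER0002
10000012345ABCDE
10000067890FGHIJ
2FOOTER0001FOOTER0002
EOF
```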

Is it possible to use the Complex Flat File stage to load the file into DataStage?

Regards,
Derek CHU
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

What type of file is this?

Is it a normal file that can be viewed with the cat command or the vi editor, or is it generated by a mainframe system?

The CFF stage is typically used when the file is generated by a mainframe (MF) system.
Regards,
S. Kirtikumar.
derekchu
Participant
Posts: 23
Joined: Sun Sep 24, 2006 10:09 pm

Post by derekchu »

Yes, the file comes from a mainframe.

What I want to do is to load the header, body and footer into different datasets.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Kirtikumar wrote: Is it a normal file that can be viewed with the cat command or the vi editor, or is it generated by a mainframe system?
Mainframe systems produce 'normal' files, too. :wink:

Are you trying to ask whether it is EBCDIC or has packed fields in it? That would help make the case for using the CFF stage, but its use certainly isn't limited to 'mainframe' files... more like, well, 'complex' files regardless of source.

I haven't had the pleasure, so can't help with the gory details but the CFF stage should be able to handle your file. Have you gone through the Help that's available for it?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

chulett wrote:Mainframe systems produce 'normal' files, too. :wink:
Are you trying to ask if it is EBCDIC or has packed fields in it?
Yes, I meant an EBCDIC file.

I am aware that a mainframe can generate a normal file as well, so I should have said EBCDIC earlier.
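As an aside, if the file does arrive in EBCDIC and you just want to inspect the character data outside DataStage, the standard dd utility can translate it. A minimal sketch (the file names are placeholders, and the sample input is generated here only for demonstration):

```shell
# Make a small EBCDIC sample from ASCII text (demo only), then
# translate it back with dd's built-in EBCDIC-to-ASCII table.
printf '0HEADER0001' > ascii_in.dat
dd if=ascii_in.dat of=ebcdic.dat conv=ebcdic 2>/dev/null   # ASCII -> EBCDIC
dd if=ebcdic.dat of=ascii_out.dat conv=ascii 2>/dev/null   # EBCDIC -> ASCII
```

Note this handles only the character translation; packed-decimal (COMP-3) fields still need a proper tool such as the CFF stage.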

If the file coming from the mainframe is a normal one, then reading it with two columns - one for the record type and one holding the rest of the line as a single column - should work. You can then use a Filter stage to separate out the header, detail, and trailer records. Once this is done, you can split the remaining column into its individual fields.

I have never tried this approach on a complex file, but it should work.
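Just to make the idea concrete, here is the same record-type split sketched outside DataStage, assuming a plain ASCII file; input.txt and the output file names are illustrative, and the sample data is invented:

```shell
# One-pass split on the first character of each record:
# 0 -> header, 1 -> detail (body), 2 -> trailer.
printf '0HDR\n1AAA\n1BBB\n2FTR\n' > input.txt   # tiny made-up sample
awk '{
    t = substr($0, 1, 1)
    if      (t == "0") print > "header.txt"
    else if (t == "1") print > "detail.txt"
    else if (t == "2") print > "trailer.txt"
}' input.txt
```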
Regards,
S. Kirtikumar.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yup, you can always take that approach as long as you don't mind parsing out all of the individual columns for each piece yourself. Pretty sure the CFF stage will automate all that.
-craig

"You can never have too many knives" -- Logan Nine Fingers
talk2shaanc
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

If it's an ASCII flat file (referred to as a 'normal' file in previous posts), there are several approaches:

1. Split the file upfront in a script (shell, Perl, or CoSort) and do the remaining logic in the DataStage job.

2. If you don't need the header and trailer records in your DS job, use an External Source stage and define the layout of your detail (body) record in the stage's metadata. In the Source Program property you can have:

Code:

grep '^1' <file name>
People have the perception that this takes time, but in my experience with 24 million records it didn't take much time - it was pretty much the same as a Sequential File stage.

3. Read only two columns, rec_type and all_fields. Use a Filter stage to split the records, write them out to files, and then use the three files in the next job.

4. The ugly and dirty way: use a Sequential File stage with the layout of the detail record and Reject Mode = Output. This will write your header and trailer to the reject file while the remaining records move forward. A caveat: any detail (body) records with invalid or junk values in some of the fields will also be written to the reject file. You can also skip the header by setting "First Line Is Column Name".
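For option 1, a minimal shell sketch of the upfront split; the file names are placeholders, the sample data is a made-up stand-in for the real source, and this reads the file once per record type:

```shell
# Split source.dat by record type before the DataStage job runs.
printf '0H\n1A\n1B\n2F\n' > source.dat   # stand-in for the real file
grep '^0' source.dat > header.dat    # header records
grep '^1' source.dat > body.dat      # detail records
grep '^2' source.dat > footer.dat    # trailer records
```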
Shantanu Choudhary
derekchu
Participant
Posts: 23
Joined: Sun Sep 24, 2006 10:09 pm

Post by derekchu »

Thanks, Shantanu, for your reply.

We would like to avoid using any "split" approach (to avoid unnecessary file landing).

Is there any way to handle this kind of source file in DataStage so that it can be loaded directly?

Thanks,

Derek Chu
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You mean other than the approaches you've already been given in the other posts in this thread? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
talk2shaanc
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

derekchu wrote: Thanks, Shantanu, for your reply.

We would like to avoid using any "split" approach (to avoid unnecessary file landing).

Is there any way to handle this kind of source file in DataStage so that it can be loaded directly?

Thanks,

Derek Chu
If you want to avoid landing files, option 2 is the best one you have.
Shantanu Choudhary