Deleting the header of a data file in UNIX

Posted: Tue Dec 25, 2007 11:38 pm
by sribuz
Hi,
I receive a data file on a Unix box; the file is generated by a COBOL program.
The file has a header and a footer that repeat on every page. My objective is to eliminate the headers and footers and process the records between them.

I used a sed command to delete the header, but the real problem is that the record just before the header is deleted along with it.
When I open the file in the vi editor, I can see ^L (in blue) between the previous record and the header.

I found out that ^L stands for form feed (new page).

I need help eliminating the header without deleting any records.

Thank you in advance.

Posted: Wed Dec 26, 2007 3:29 am
by saikir
Hi,

If the header records carry some kind of record identifier, you can remove them with a constraint in the Transformer.

Sai

Posted: Wed Dec 26, 2007 5:40 am
by ArndW
What "sed" command did you use? If you use a line-oriented command in sed, it will not recognize ^L as a line break. Perhaps you could do it in two actions or two passes: first replace the form feed with a normal line break, then remove the headers.

Sed command I used

Posted: Wed Dec 26, 2007 12:13 pm
by sribuz
I used

sed -e "/Header Start/,/End Word/d" orginalfile.dat > newfile.dat

This searches for 'Header Start' and deletes every line from there through the line containing 'End Word'.
Using this command, I deleted all the header instances and saved the result to the new file.

FYI, the expression /Header Start/,/End Word/d has to be put in double quotes (as in the statement above) when the start or end pattern contains spaces.
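
The lost record can be reproduced with a small made-up file. This is only a sketch: it assumes the form feed sits on the same physical line as the last record of the page and the header start (which is what the ^L seen in vi suggests), and the file name sample.dat and the record contents are invented for illustration.

```shell
# Build a hypothetical sample: the form feed (\f) joins "record 2" and the
# header start on one physical line (an assumption based on the vi view).
printf 'record 1\nrecord 2\fHeader Start\npage title\nEnd Word\nrecord 3\n' > sample.dat

# sed addresses whole lines, so the range starts on the line that holds
# both "record 2" and "Header Start", and that entire line is deleted.
result=$(sed -e "/Header Start/,/End Word/d" sample.dat)

printf '%s\n' "$result"
```

Under that assumption only "record 1" and "record 3" survive; "record 2" is removed together with the header because sed never sees it as a separate line.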

Posted: Wed Dec 26, 2007 12:26 pm
by ArndW
I'm no sed expert, but you could put in

Code: Select all

sed -e 's/\o014/\o015/' -e "/Header Start/,/End Word/d"

Posted: Wed Dec 26, 2007 1:36 pm
by ray.wurlod
Ask the COBOL developers to generate a file that lacks these headers and footers.
All they need to do is to comment out some statements.

Re: ArndW

Posted: Wed Dec 26, 2007 3:17 pm
by sribuz
ArndW wrote:I'm no sed expert, but you could put in

Code: Select all

sed -e 's/\o014/\o015/' -e "/Header Start/,/End Word/d"
...

I tried the sed command you gave, but it is still deleting the record before the header.

Posted: Thu Dec 27, 2007 4:46 am
by ArndW
I was guessing at the sed command. Please check the man pages on sed to see what mistake I made in the first part; most likely I chose the wrong octal code for the FF character.
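
For reference, octal 014 is the form feed, octal 015 is the carriage return, and the line feed is octal 012 (the \o escapes themselves are a GNU sed extension). Correcting the code alone would not be enough inside a single sed invocation, though: the substitution only places a newline inside the current pattern space, and the range address still matches the record and the header as one unit. A minimal two-pass sketch, assuming the form feed shares a line with the preceding record, converts the form feeds with tr first and pipes the result to sed. The sample contents below are invented; the file names are the ones from the earlier post.

```shell
# Hypothetical sample standing in for orginalfile.dat: the form feed (\f)
# sits between "record 2" and the header start on the same line.
printf 'record 1\nrecord 2\fHeader Start\npage title\nEnd Word\nrecord 3\n' > orginalfile.dat

# Pass 1: tr turns every form feed (octal 014) into a line feed,
#         so each header starts on a line of its own.
# Pass 2: sed deletes each header block and leaves the records alone.
tr '\014' '\n' < orginalfile.dat | sed -e "/Header Start/,/End Word/d" > newfile.dat

cat newfile.dat
```

With this sample all three records are kept. The same pipeline shape should apply to footers by adding a second range expression with the footer's start and end patterns.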