Need to delete last 2 records in a flat file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

rodre
Premium Member
Posts: 218
Joined: Wed Mar 01, 2006 1:28 pm
Location: Tennessee

Need to delete last 2 records in a flat file

Post by rodre »

I have fixed-width flat files that have two bad records at the end of each file.
The last record, I think, is a carriage return, but I am not sure; it looks like:
The other record has "99999999007563" plus 300 blank spaces. The last digits of this number are the record count.
All other records are only 200 characters long.

Loading this file in a Server job was no problem, but we are migrating to parallel jobs on UNIX and the job aborts because of these two records.

I did a search but none of the suggestions seem to work well. I was able to use sed '/999999990/d' <file name> to eliminate one of the records, but I am concerned that it might also match a legitimate record containing 999999990 and delete that one as well. Also, I did not know how to eliminate the , since it would not compile.

Your help is much appreciated! :)
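One way to see exactly what those last two records contain (for example, whether that final record really is just a carriage return) is to dump the end of the file byte by byte. A minimal check, with a placeholder file name, would be:

Code:

# Show the last two lines with non-printing characters spelled out (\r, \n, etc.)
tail -n 2 input_file.txt | od -c

Anchoring the sed pattern to the start of the line, e.g. sed '/^999999990/d', also reduces the chance of deleting a legitimate record that merely contains that digit string somewhere in a field.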
mail2hfz
Premium Member
Posts: 92
Joined: Thu Nov 16, 2006 8:51 am

Post by mail2hfz »

If you want to get rid of the last two records, delete those lines in a before-job subroutine or in an invocation shell script (sed 'N;$!P;$!D;$d' <<inp file>> > newfile). If you are planning to handle those records instead, please provide the exact criteria with samples.
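For reference, a minimal wrapper around that sed one-liner, with placeholder file names, could look like this:

Code:

#!/bin/sh
# Drop the last two lines of the source file before the parallel job reads it.
# The sed expression keeps a sliding two-line window and discards the final window at end of input.
sed 'N;$!P;$!D;$d' /path/to/input_file.txt > /path/to/input_file_clean.txt

The job would then read the cleaned copy, or the command could be run from an ExecSH before-job call.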
swapnilverma
Participant
Posts: 135
Joined: Tue Aug 14, 2007 4:27 am
Location: Mumbai

Post by swapnilverma »

With UNIX you can do this:

reclen=200

# Split the file by record length: 200-character lines are good, anything else is bad.
while IFS= read -r line
do
    count=${#line}    # length of the line itself; wc -c $line would treat the text as a file name
    if (( count == reclen )); then
        echo "$line" >> good_data_file.txt
    else
        echo "$line" >> bad_data_file.txt
    fi
done < filename

This will eliminate all the bad rows, regardless of their position in the file.

Alternatively, if only the last two records have to be removed:

count=`wc -l < filename`    # redirect so wc prints only the count, not the file name
count=`expr $count - 2`
head -n $count filename > good_data_filename
Thanks
Swapnil

"Whenever you find whole world against you just turn around and Lead the world"
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

As mentioned earlier in the posts, sed is a single-line 'simple' command which does the 'quick job'.

Regards
Sreeni
chowdhury99
Participant
Posts: 43
Joined: Thu May 29, 2008 8:41 pm

Post by chowdhury99 »

You may use the awk command 'length == 200' to separate out all records with length 200.
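A sketch of that approach, with placeholder file names, might be:

Code:

# Keep the 200-character records and divert everything else to a reject file
awk 'length == 200' input_file.txt > good_records.txt
awk 'length != 200' input_file.txt > bad_records.txt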

Thanks
rodre
Premium Member
Posts: 218
Joined: Wed Mar 01, 2006 1:28 pm
Location: Tennessee

Post by rodre »

Thank you for all the suggestions. I am new to working with UNIX, and I ran the code below but it is failing. Can anyone let me know what I am doing wrong? :roll:

Code:

reclen=199 
cat /apps01/Int/Work/SAS/Files/seqSourceFile.txt |while read line do 
{ 
count=`wc -c $line` 
if (($count == $reclen)) ; then 
echo $line >> /apps01/Int/Work/SAS/Files/seqSourceFile1.txt 
else 
echo $line >> /apps01/Inte/Work/SAS/Files/seqSourceFile2.txt 
} 
done  
It is giving me this error:

/apps01/Int/Work/SAS/Scripts/SASdeleteBadRows.sh: line 17: syntax error near unexpected token `}'
/apps01/Int/Work/SAS/Scripts/SASVABdeleteBadRows.sh: line 17: `}'
dsadm@axxxxxxxxx-0[/apps01/dsadm](!)$
I appreciate your help!
SwathiCh
Premium Member
Posts: 64
Joined: Mon Feb 08, 2010 7:17 pm

Post by SwathiCh »

Remove the braces {} from the code and run it.

We do not need to give C-style beginning and ending markers for loops in the UNIX shell. 'do' is the beginning of the loop and 'done' marks the end of the loop.
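For reference, a version of that loop without the braces, with a ';' before do and a closing fi, and using ${#line} for the length (wc -c $line would try to open the line's text as a file), would look something like this, keeping the paths from the post above:

Code:

reclen=200    # good records are 200 characters long, per the first post
while IFS= read -r line; do
    if (( ${#line} == reclen )); then
        echo "$line" >> /apps01/Int/Work/SAS/Files/seqSourceFile1.txt
    else
        echo "$line" >> /apps01/Inte/Work/SAS/Files/seqSourceFile2.txt
    fi
done < /apps01/Int/Work/SAS/Files/seqSourceFile.txt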
--
Swathi Ch