Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.
pandu80
Participant
Posts: 50 Joined: Fri Apr 08, 2005 5:56 pm
by pandu80 » Thu Jun 02, 2005 11:12 am
Hi,
I would like to count the number of lines in a text file.
I am using 'cat filename | wc -l', which returns the number of lines.
Before that, though, I need to eliminate the empty lines in the text file.
How can I achieve this?
Any help would be appreciated.
TIA
amsh76
Charter Member
Posts: 118 Joined: Wed Mar 10, 2004 10:58 pm
by amsh76 » Thu Jun 02, 2005 11:15 am
Where are these empty lines? You can use the head, tail, or sed commands in UNIX to remove those extra records.
Sainath.Srinivasan
Participant
Posts: 3337 Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom
by Sainath.Srinivasan » Thu Jun 02, 2005 11:40 am
You can use grep to remove empty lines.
head and tail do not identify or remove empty lines.
amsh76
Charter Member
Posts: 118 Joined: Wed Mar 10, 2004 10:58 pm
by amsh76 » Thu Jun 02, 2005 12:10 pm
Hi Sainath,
What I meant by head and tail is that they can be used to move the data (without the empty records) to another file, provided he knows the position of the empty records.
ray.wurlod
Participant
Posts: 54607 Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
by ray.wurlod » Thu Jun 02, 2005 4:17 pm
sed (stream editor) is probably the fastest mechanism for removing empty lines; then pipe the result through wc -l.
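A minimal sketch of that approach, assuming a POSIX shell and that "empty" means a line with no characters at all. The function name count_nonempty and the sample file are hypothetical, used only for illustration:

```shell
# Hypothetical helper: count lines that contain at least one character.
count_nonempty() {
    # sed deletes zero-length lines; wc -l counts whatever is left.
    # Feeding wc from a pipe keeps the file name out of its output.
    sed '/^$/d' "$1" | wc -l
}

# Illustrative sample: three data lines separated by two empty lines.
sample=$(mktemp)
printf 'a\n\nb\n\nc\n' > "$sample"
count_nonempty "$sample"
rm -f "$sample"
```

Some wc implementations pad the number with leading spaces, so trim it if the count feeds into a string comparison.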
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
SriKara
Premium Member
Posts: 30 Joined: Wed Jun 01, 2005 8:40 am
Location: UK
by SriKara » Sun Jun 05, 2005 12:56 am
You can remove the empty lines from a file and then count the rest by using the command below.
cat filename | grep -v "^$" | wc -l
Sainath.Srinivasan
Participant
Posts: 3337 Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom
by Sainath.Srinivasan » Sun Jun 05, 2005 4:07 am
You can shorten it to
grep -v '^$' filename | wc -l
But this assumes that an 'empty line' is a line with nothing in it at all, not even blank spaces. Otherwise your grep pattern will need to be different.
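To illustrate the distinction, here is a small sketch assuming a POSIX grep. The sample data is made up for the example, and the [[:space:]] class also treats tabs as blank, which goes slightly beyond the plain spaces discussed here:

```shell
# Illustrative data: two data lines, one empty line, one line of three spaces.
sample=$(mktemp)
printf 'a\n\n   \nb\n' > "$sample"

# Drops only zero-length lines, so the spaces-only line is still counted.
strict=$(grep -v '^$' "$sample" | wc -l)

# Drops lines that are empty or contain nothing but whitespace.
loose=$(grep -v '^[[:space:]]*$' "$sample" | wc -l)

echo "non-empty: $strict, non-blank: $loose"
rm -f "$sample"
```

The two counts differ by exactly the number of whitespace-only lines, which is a quick way to check which definition a given file needs.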
Prashantoncyber
Participant
Posts: 108 Joined: Wed Jul 28, 2004 7:15 am
by Prashantoncyber » Sun Jun 05, 2005 4:29 am
Sainath,
What should the command be if a line contains only blank spaces and we don't want to count those lines either?
thanks
Prashant
SriKara
Premium Member
Posts: 30 Joined: Wed Jun 01, 2005 8:40 am
Location: UK
by SriKara » Sun Jun 05, 2005 6:02 am
To determine the number of lines that consist entirely of spaces, the command below can be used.
cat filename | tr -s " " " " | grep '^ $' | wc -l
I don't know a better way to do this. Using the tr command on a large file can be cumbersome, though. Unix gurus?
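One possible alternative, offered as a sketch rather than a definitive answer: grep can match "one or more spaces and nothing else" directly, so the cat and tr stages are not needed. This assumes a POSIX grep; count_space_only is a hypothetical name:

```shell
# Hypothetical helper: count lines made up of one or more spaces and nothing else.
count_space_only() {
    # grep -c prints the number of matching lines. \{1,\} is the POSIX
    # basic-regex interval meaning "one or more of the preceding character".
    grep -c '^ \{1,\}$' "$1"
}

# Illustrative sample: two data lines, one empty line, two spaces-only lines.
sample=$(mktemp)
printf 'a\n \n\n    \nb\n' > "$sample"
count_space_only "$sample"
rm -f "$sample"
```

Truly empty lines are not counted here, which matches the behaviour of the tr pipeline above.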
chulett
Charter Member
Posts: 43085 Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO
by chulett » Sun Jun 05, 2005 7:21 am
I would think sed would be a better answer. Takes all the normal regular expressions, so something like this perhaps?
sed -e '/^$/d' -e '/^ *$/d' filename
Off the top of my head, this should remove blank lines and lines that only consist of spaces. Exact syntax may vary from UNIX to UNIX. Output would be to standard out, so it could be used in the Filter of the Sequential File stage.
Of course, pipe the output to wc if you just want to count the results.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Sainath.Srinivasan
Participant
Posts: 3337 Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom
by Sainath.Srinivasan » Mon Jun 06, 2005 10:43 am
Sed is good. You can also do the same with grep, which may be more efficient.
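One way this might look with grep, as a sketch and not necessarily what was intended above: match any line that contains a non-space character and let grep do the counting itself. This assumes a POSIX grep; count_nonblank is a hypothetical name:

```shell
# Hypothetical helper: count lines that contain at least one non-space character.
count_nonblank() {
    # A line matches '[^ ]' only if something other than a space is on it,
    # so empty and spaces-only lines are skipped and no wc stage is needed.
    grep -c '[^ ]' "$1"
}

# Illustrative sample: two data lines, one empty line, one spaces-only line.
sample=$(mktemp)
printf 'a\n\n   \nb c\n' > "$sample"
count_nonblank "$sample"
rm -f "$sample"
```

Note that grep exits with status 1 when nothing matches, even though -c still prints 0, so a caller that checks the return code should allow for that.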