Wednesday, July 30, 2008

Count number of occurrences of a word

How can you count the number of occurrences of a word in a file? Say I want to count how many times ring appears in a file.

How about using the -c option of grep?
% grep -c ring /foo/bar

And what if the file contains words such as string and rings? They would inflate our count of ring.
The -w option of grep solves this problem by matching whole words only:
% grep -cw ring /foo/bar

What if the word ring appears more than once in a line? grep would produce an incorrect count in this case, since -c counts the number of lines in which the pattern is found, not the number of matches. grep -c alone is not sufficient for this job. We need something that can count multiple occurrences of a word in a line.
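
To see the difference, here is a small sketch that runs both commands on a throwaway file. The file and its three lines are made up for this illustration; they are not from any real data.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'ring ring ring' 'a string' 'one ring' > "$sample"

# -c counts every line containing the substring "ring", so "a string" is included.
substring_lines=$(grep -c ring "$sample")

# -cw counts only the lines that contain "ring" as a whole word.
word_lines=$(grep -cw ring "$sample")

echo "lines matching substring:  $substring_lines"
echo "lines matching whole word: $word_lines"

rm -f "$sample"
```

With this sample the two commands should report 3 and 2, although the word ring itself occurs four times.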

To get the exact count:
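#!/usr/bin/perl
# search_word.pl: count whole-word occurrences of a word in a file
use strict;
use warnings;

my $search_this = shift or exit 1;   # word to search for; exit if none is given
my $count = 0;
while (<>) {
    # \Q...\E quotes regex metacharacters in the word; /g finds every match on the line
    while (m/\b\Q$search_this\E\b/g) {
        $count++;
    }
}
# After the loop, $ARGV holds the name of the last file read
if ($count == 0) {
    print $ARGV . " does not contain " . $search_this . "\n";
}
else {
    print $ARGV . " contains " . $search_this . " " . $count . (($count == 1) ? " time\n" : " times\n");
}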


Run this Perl script as:
% perl search_word.pl ring filename

And what if I don't want to use Perl? Then let grep's -o option print each match on its own line and count those lines with wc -l:
% grep -w -o ring /foo/bar | wc -l
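
If you need this often, the pipeline can be wrapped in a small shell function. This is only a sketch: count_word is a name I made up, the sample file is invented for the demonstration, and it assumes your grep supports -o (GNU and BSD grep do, but POSIX does not require it). The extra -F is my addition so that the word is treated as a fixed string rather than a regular expression.

```shell
# count_word WORD FILE: print how many times WORD occurs as a whole word in FILE.
# count_word is a hypothetical helper name, not a standard command.
count_word() {
    # -o prints each match on its own line, -w matches whole words only,
    # -F takes the word literally, -e marks the next argument as the pattern.
    # tr strips the padding that some wc implementations put around the number.
    grep -o -w -F -e "$1" "$2" | wc -l | tr -d ' '
}

# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'ring ring ring' 'a string' 'one ring' > "$sample"

count_word ring "$sample"

rm -f "$sample"
```

For this sample the function should print 4: three matches on the first line, one on the last, and none for string.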
