This is a brief code snippet showing how to use command-line tools to count the occurrences of words matching a regex. In the following example, I list each distinct “/chart” URL that was hit in the Apache access log, along with the number of hits.
I am using the GNU version of sed (the Linux and Cygwin version); if you are using the BSD version (as on OS X), replace “-r” with “-E”.
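# Work on a snapshot of the last 80000 log lines.
tail -n 80000 /var/log/apache2/access_log > tail.txt
# Extract the method name (getSomething) from each POST to /chart/ and sort the names.
grep -E "POST /chart/" tail.txt \
| sed -r -n 's/.*chart\/(get[a-zA-Z]+).*/\1/p' \
| sort > sortert
# Count each unique name. -x -F matches whole lines literally, so getFoo does not also count getFooBar.
for word in $(uniq < sortert); do
printf '%s : ' "$word"
grep -c -x -F -- "$word" sortert
done \
| sed -r -n 's/([a-zA-Z]+) : (.+)/\2\t\1/p' \
| sort -g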
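
The loop above can probably be replaced by "sort | uniq -c", which counts adjacent duplicate lines in a single pass. Below is a minimal sketch of that variant, not a drop-in replacement I have run against a real log. The function name count_chart_calls is my own invention for illustration, and it assumes a sed that accepts “-E” (BSD sed and recent GNU sed both do).

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not part of the original snippet).
# Reads access-log lines on stdin and prints "count<TAB>name", lowest count first.
count_chart_calls() {
    grep -E 'POST /chart/' \
    | sed -E -n 's/.*chart\/(get[a-zA-Z]+).*/\1/p' \
    | sort \
    | uniq -c \
    | sort -n \
    | awk '{ printf "%s\t%s\n", $1, $2 }'
}

# Example usage on the same log as above:
#   tail -n 80000 /var/log/apache2/access_log | count_chart_calls
```

Because uniq -c compares whole lines, this version has no partial-match problem, and it avoids the temporary files tail.txt and sortert. The final awk step only turns the padded output of uniq -c into the same tab-separated layout as the original.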









