Regular Expression Trouble

Wed Aug 27 15:29:06 UTC 2008

Hi Martin.

On Wed, Aug 27, 2008 at 08:25:02AM -0500, Martin McCormick wrote:
> 
> Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6
> (peaster-laptop) via 10.198.71.246 
> 
> That was one line broken to aid in emailing, but that's what
> types of lines are involved. The MAC appears at different field
> locations depending on the type of event being logged so awk is
> perfect for certain types of lines, but it misses others and no
> one awk expression gets them all.

While I agree with others that awk should be used with explicit
recognition of the particular lines, you can still snatch everything
with sed if you want to.  In FreeBSD, sed supports extended regex via
the -E option, so:

	sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p'

The "-n" option tells sed not to print the line unless instructed to
explicitly, and the "p" flag at the end is that instruction.  As
for the regex ... well, that's straightforward enough.
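
For anyone who wants the regex spelled out anyway, here is a rough
sketch that annotates each piece and runs it against the sample line
from the original message (rejoined onto one line).  The function name
extract_mac is only a label for illustration.

```shell
# extract_mac is a hypothetical wrapper around the sed command above.
extract_mac() {
    # .*                  anything before the address
    # [0-9a-f]{2}         first octet: two lowercase hex digits
    # (:[0-9a-f]{2}){5}   five more octets, each preceded by a colon
    # .*                  anything after the address
    # \1 is the outer group, i.e. the whole MAC address.
    sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p'
}

# Sample log line from the original message, on a single line.
line='Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6 (peaster-laptop) via 10.198.71.246'

# Prints only the MAC address: 00:12:f0:88:97:d6
printf '%s\n' "$line" | extract_mac
```

Note that the timestamp (20:45:36) is not picked up, since it has only
three colon-separated pairs.  The character class is lowercase only, so
this assumes the log writes MAC addresses in lowercase, as the sample
line does.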

> 	This is an attempt to isolate every MAC address that
> appears and then sort and count them to see who is having
> trouble or, in some cases, is causing trouble.

Then you still may want to use awk for some of that...

	cat /var/log/dhcpd.log | \
	sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p' | \
	awk '
	 { a[$1]++; }
	 END {
	  for(i in a){
	   printf("%7.0f\t%s\n", a[i], i);
	  }
	 }
	' | sort -nr

You can join the lines into a single command line if you like, or toss
it as-is into a tiny shell script.  Awk is forgiving about whitespace.
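
If you go the shell script route, a minimal sketch might look like the
following.  It swaps the awk counting step for sort and uniq -c, which
is a different technique from the pipeline above but should give the
same kind of tally.  The function name count_macs is made up for this
example; the default path is the one used above.

```shell
#!/bin/sh
# count_macs is a hypothetical helper: it prints "count MAC" pairs,
# most frequent first.  The log file defaults to /var/log/dhcpd.log.
count_macs() {
    sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p' \
        "${1:-/var/log/dhcpd.log}" |
    sort |        # group identical addresses together
    uniq -c |     # collapse each group into "count address"
    sort -nr      # highest count first
}
```

Calling count_macs with no argument reads the default log; passing a
file name reads that file instead.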

You should in theory be able to feed the same regex to awk, but I've
found that awk's extended regex support doesn't always behave as I'd
expect.
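
For what it's worth, here is a sketch of an awk-only version.  It
assumes the trouble is with {n} interval expressions, which not every
awk implementation supports, so it builds the pattern by repetition
instead.  The name mac_counts_awk is hypothetical.  Unlike the sed
version, match() picks the leftmost address on a line.

```shell
# mac_counts_awk is a hypothetical helper; it reads the files given as
# arguments, or standard input if there are none.
mac_counts_awk() {
    awk '
    BEGIN {
        # Build the MAC pattern without {n} intervals.
        hex = "[0-9a-f][0-9a-f]"
        re = hex
        for (i = 0; i < 5; i++) re = re ":" hex
    }
    # match() sets RSTART and RLENGTH for the leftmost match.
    match($0, re) { count[substr($0, RSTART, RLENGTH)]++ }
    END {
        for (mac in count) printf("%7d\t%s\n", count[mac], mac)
    }
    ' "$@" | sort -nr
}
```

Something like mac_counts_awk /var/log/dhcpd.log should then produce
the same kind of count-and-address listing as the sed and awk pipeline.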

Hope this helps.

p

-- 
  Paul Chvostek                                             <paul at it.ca>
  it.canada                                            http://www.it.ca/