kern/161899: Repeating RTM_MISS packets causing high CPU load for
ntpd
Steven Hartland
killing at multiplay.co.uk
Sat Oct 22 13:20:09 UTC 2011
>Number: 161899
>Category: kern
>Synopsis: Repeating RTM_MISS packets causing high CPU load for ntpd
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Oct 22 13:20:08 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator: Steven Hartland
>Release: 8.2-STABLE
>Organization:
Multiplay
>Environment:
FreeBSD db4.multiplay.co.uk 8.2-STABLE FreeBSD 8.2-STABLE #16: Tue Oct 4 09:53:17 UTC 2011 root at db3-old.multiplay.co.uk:/usr/obj/usr/src/sys/MULTIPLAY amd64
>Description:
We just updated a machine to 8-STABLE and I've noticed
that ntpd is using notible amounts of CPU 5-7% which
is very high for such a trivial daemon.
8.2-STABLE FreeBSD 8.2-STABLE #16: Tue Oct 4 09:53:17 UTC 2011
truss indicates its constantly checking and reading
from a socket
0.047297485 select(29,{20 21 22 23 24 25 26 27 28},0x0,0x0,0x0) = 1 (0x1)
0.047513160 clock_gettime(0,{1317770389.969538247 }) = 0 (0x0)
0.047604515 select(29,{20 21 22 23 24 25 26 27 28},0x0,0x0,{0.000000 }) = 1 (0x1)
0.047668212 read(28,"\M-8\0\^E\a\0\0\0\0@\0\0\0\^A\0"...,5120) = 184 (0xb8)
0.049395293 select(29,{20 21 22 23 24 25 26 27 28},0x0,0x0,0x0) = 1 (0x1)
0.049503689 clock_gettime(0,{1317770389.971526820 }) = 0 (0x0)
0.049606219 select(29,{20 21 22 23 24 25 26 27 28},0x0,0x0,{0.000000 }) = 1 (0x1)
0.049669916 read(28,"\M-8\0\^E\a\0\0\0\0@\0\0\0\^A\0"...,5120) = 184 (0xb8)
0.049809882 select(29,{20 21 22 23 24 25 26 27 28},0x0,0x0,0x0) = 1 (0x1)
..
running with debug enabled it sits looping outputting:-
routing message op = 7: ignored
routing message op = 7: ignored
routing message op = 7: ignored
routing message op = 7: ignored
routing message op = 7: ignored
routing message op = 7: ignored
routing message op = 7: ignored
..
It seems socket 28 is a duplicate of an internal routing socket
as seen here in the trace:-
0.044544269 socket(PF_ROUTE,SOCK_RAW,0) = 4 (0x4)
0.044595394 fcntl(4,F_DUPFD,0x14) = 28 (0x1c)
0.044645960 close(4) = 0 (0x0)
0.044695968 fcntl(28,F_SETFL,O_NONBLOCK) = 0 (0x0)
Now this looks like its RTM_MISS as defined:-
sys/net/route.h:#define RTM_MISS 0x7 /* Lookup failed on this address */
So the question was why is PF_ROUTE socket constantly
spamming RTM_MISS?
route -n monitor on this machines shows:-
got message of size 184 on Tue Oct 4 23:46:36 2011
RTM_MISS: Lookup failed on this address: len 184, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
::A.B.C.D
got message of size 184 on Tue Oct 4 23:46:36 2011
RTM_MISS: Lookup failed on this address: len 184, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
::A.B.C.D
This seems very much like the following pr which was fixed:-
"Remove a bogusly introduced rtalloc_ign() in rev. 1.335/SVN 178029,
generating an RTM_MISS for every IP packet forwarded making user space
routing daemons unhappy":-
http://www.freebsd.org/cgi/query-pr.cgi?pr=124540
The box is doing no routing, its fairly basic install with
1 main IP on em0 + 1 alias + gw addres and 1 private ip on em1.
Its running mysql and thats about it.
Additional discussion can be found in the the thread topic freebsd-stable list
"serious packet routing issue causing ntpd high load?"
>How-To-Repeat:
According to Vlad Galu, which Li Qing confirms on the original discussion thread:
"serious packet routing issue causing ntpd high load?"
1. Start listening for routing messages on a non-default FIB, e.g.
setfib 2 route monitor
2. Add any static route within that FIB.
3. The machine I run the test on is a heavy DNS client, it generates a
few dozen requests per second. The monitor process starts getting
RTM_MISS messages for each outgoing DNS request (the dst sockaddr is
the same as my first resolv.conf entry, seq is always 0).
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
More information about the freebsd-bugs
mailing list