Packet loss every 30.999 seconds

Bruce Evans brde at optusnet.com.au
Thu Dec 27 21:23:52 PST 2007


On Fri, 28 Dec 2007, Bruce Evans wrote:

> In previous mail, you (Mark) wrote:
>
> # With FreeBSD 4 I was able to run a UDP data collector with rtprio set,
> # kern.ipc.maxsockbuf=20480000, then use setsockopt() with SO_RCVBUF
> # in the application.  If packets were dropped they would show up
> # with netstat -s as "dropped due to full socket buffers".
> #
> # Since the packet never makes it to ip_input() I no longer have
> # any way to count drops.  There will always be corner cases where
> # interrupts are lost and drops not accounted for if the adapter
> # hardware can't report them, but right now I've got no way to
> # estimate any loss.
>
> I tried using SO_RCVBUF in ttcp (it's an old version of ttcp that doesn't
> have an option for this).  With the default kern.ipc.maxsockbuf of 256K,
> this didn't seem to help.  20MB should work better :-) but I didn't try that.

I've now tried this.  With kern.ipc.maxsockbuf=20480000 (~20MB) and an
SO_RCVBUF of 0x1000000 (16MB), the "socket buffer full" lossage increases
from ~300 kpps (~47%) to ~450 kpps (~70%) with tiny packets.  I think
this is caused by most accesses to the larger buffer being cache misses
(since the system can't keep up, cache misses make it worse).
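
For concreteness, the setup being tested is just the usual setsockopt()
dance (a minimal sketch, not the actual ttcp patch; only the 16MB value
matches the test above, the rest is boilerplate):

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

int
main(void)
{
	int s, rcvbuf;
	socklen_t len;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s == -1) {
		perror("socket");
		return (1);
	}
	rcvbuf = 0x1000000;	/* 16MB; must not exceed kern.ipc.maxsockbuf */
	if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf,
	    sizeof(rcvbuf)) == -1)
		perror("setsockopt(SO_RCVBUF)");
	/* Read back what the kernel actually granted. */
	len = sizeof(rcvbuf);
	if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &len) == 0)
		printf("effective SO_RCVBUF: %d\n", rcvbuf);
	return (0);
}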

However, with 1500-byte packets, the larger buffer reduces the lossage
from 1 kpps out of 76 kpps to precisely zero pps, at a cost of only a
small percentage of system overhead (~20% idle to ~18% idle).
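
(For scale, 76 kpps of 1500-byte packets is about 76000 * 1500 * 8
~= 912 Mbit/s, so this is presumably close to line rate for a
1 Gbit/s interface.)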

The above is with net.isr.direct=1.  With net.isr.direct=0, the loss is
too small to be obvious and is reported as 0, but I don't trust the
report.  ttcp's packet counts indicate losses of a few per million with
direct=0 but none with direct=1.  "while :; do sync; sleep 0.1; done" in
the background causes a loss of about 100 pps with direct=0 and a
smaller loss with direct=1.  Running the ttcp receiver at rtprio 0
doesn't make much difference to the losses.
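
The drop report here is the same counter netstat -s prints as "dropped
due to full socket buffers"; if you want to poll it from a program
rather than diff netstat output, something like this should work (a
sketch assuming FreeBSD's struct udpstat and its udps_fullsock field
from <netinet/udp_var.h>):

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <netinet/udp_var.h>

int
main(void)
{
	struct udpstat st;
	size_t len = sizeof(st);

	/* Same counters that "netstat -s -p udp" reads. */
	if (sysctlbyname("net.inet.udp.stats", &st, &len, NULL, 0) == -1) {
		perror("sysctlbyname(net.inet.udp.stats)");
		return (1);
	}
	printf("dropped due to full socket buffers: %lu\n",
	    (unsigned long)st.udps_fullsock);
	return (0);
}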

Bruce

