Frequent network access freeze (in 7.0)
rwatson at FreeBSD.org
Wed Feb 20 11:35:35 UTC 2008
On Wed, 20 Feb 2008, Unga wrote:
> I'm running 7.0-PRERELEASE (RC2, dated 15/02/2008), compiled from sources on
> i386 machine (512MB RAM, 3.0GHz, tx0: <SMC EtherPower II 10/100>).
> Network access freezes very frequently. Cannot ping to any ip address. The
> only way to get networking working again is reboot.
> I'm having this problem on 7.0 ever since I tried it from BETA4. I have
> reported also to this list before but sadly nobody was interested on it.
> If somebody is interested to look into this problem, I could furnish with
> more detail and participate in testing.
This sort of problem frequently turns out to be a bug in a device driver or a
problem with interrupt probing/configuration, so my first guess would be a
problem with the if_tx driver. The usual starting diagnostic when ping fails
is to use tcpdump to determine whether it's receive or transmit that is
failing (or both). Quiet the network between two endpoints as much as you can
so that unrelated traffic doesn't clutter the dumps, and dump arp and icmp at
both endpoints. Now try to ping from each endpoint to the other.
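
As a rough sketch of the capture step, the helper below builds a tcpdump
command line restricted to ARP and ICMP. The function name tcpdump_argv is
made up for illustration, and tx0 is simply the interface name from the
report above; substitute whatever interface each endpoint actually uses.
The -n flag turns off name resolution, so the capture does not depend on
DNS lookups over the link being debugged.

```python
import shlex


def tcpdump_argv(interface):
    """Return an argv list for capturing only ARP and ICMP on one interface.

    Hypothetical helper for illustration; it only builds the command and
    does not run it (tcpdump normally needs root privileges).
    """
    return [
        "tcpdump",
        "-n",             # do not resolve addresses to names
        "-i", interface,  # capture on this interface only
        "arp or icmp",    # pcap filter expression: ARP and ICMP traffic
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command for the interface in the report.
    print(shlex.join(tcpdump_argv("tx0")))
```

Run the resulting command on both endpoints at the same time, then start
the pings.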
One potential source of confusion is that ping requires ARP to work, and ARP
can be a slightly confusing protocol as it usually resolves actively (query,
response) but sometimes it receives passive updates or extends existing
What you want to look for is a packet sent by one side that isn't received by
the other. You might find, for example, that your host receives packets fine,
but the packets it transmits are never received. This would be indicative of a
driver bug in which it fails to properly handle (for example) transmit queues
filling, and might only trigger under very high load. Or, you might find that
your host never receives anything the other side transmits, but can send fine.
This might be indicative of a driver bug involving the receive code, or a
problem with how interrupts are being handled more generally.
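
To make the comparison concrete, here is a minimal sketch that assumes you
have already pulled the ICMP echo sequence numbers out of the dumps by hand
or with your own script. The function names and the input format (plain
lists of sequence numbers) are assumptions for illustration, not anything
tcpdump produces directly.

```python
def lost(sent, seen):
    """Sequence numbers that one side sent but the other side never saw."""
    return sorted(set(sent) - set(seen))


def classify(host_sent, peer_seen, peer_sent, host_seen):
    """Guess which direction is failing on the problem host.

    host_sent / host_seen: echo requests the problem host transmitted, and
    the peer's requests that showed up in the problem host's dump.
    peer_sent / peer_seen: the same two lists taken from the peer's dump.
    """
    tx_lost = lost(host_sent, peer_seen)  # host -> peer packets that vanished
    rx_lost = lost(peer_sent, host_seen)  # peer -> host packets that vanished
    if tx_lost and rx_lost:
        return "both"
    if tx_lost:
        return "transmit"
    if rx_lost:
        return "receive"
    return "neither"


if __name__ == "__main__":
    # The host's requests never reach the peer, but the peer's requests arrive.
    print(classify([1, 2, 3], [], [1, 2, 3], [1, 2, 3]))
```

A "transmit" result points at the transmit path of the driver, "receive" at
the receive path or interrupt handling, as described above.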
It looks like the last non-routine maintenance to the driver was done by
Maxime in about 2003; the more recent changes have all been updates to
newbus/busdma infrastructure, ifnet changes, locking changes, etc. I've CC'd
him as it sounds like he may have hardware... My advice would be to do the
above tests and see if you can narrow down whether it's transmit, receive, or
both.
Robert N M Watson
University of Cambridge
More information about the freebsd-current mailing list