6.2 SHOWSTOPPER - em completely unusable on 6.2

Volker volker at vwsoft.com
Wed Sep 27 14:18:31 PDT 2006


Patrick M. Hausen wrote:
> Hello!
> 
>> Well, the best I can say at the moment is, "Wow."  =-(  I guess the 
>> thing to do here is to figure out if the problem lies with the em 
>> interrupt handler not getting run, or the taskqueue not getting run.
> 
> I helped Pyun with some debugging by providing ssh access to
> a machine showing the (seemingly) same problem.
> 
> At first he thought the interrupt handler of the em driver was
> the culprit, but we applied quite a few patches and tested
> afterwards - seems like the driver is not the cause.
> 
> On -stable, other people have occasionally complained about very
> similar-looking problems with bge and other drivers. I'm not a
> kernel developer, just an experienced admin, but my guess is that
> em stands out as problematic just by coincidence: certain onboard
> network components tend to come with certain chipsets and certain
> architectures.
> 
> So, Pyun suggested it was a problem with the taskqueue that was
> introduced some time between 6.0 and 6.1.
> 
> With my system (Tyan GT20 B5161G20) the problem shows up under
> heavy disk and CPU activity, such as "make buildworld".
> I made sure that the em interface doesn't share an interrupt
> with the SATA controller. When the problem occurs, I get the
> well-known "watchdog timeout" messages, and then the system's
> network activity over that interface freezes completely for
> a couple of minutes.
> Usually the system recovers after a while without a reboot or
> other measures.
> 

Strange... I've seen exactly that on a (recent) RELENG_6 box, but
with a dirty old USB 1.1 NIC (aue). I kept seeing DOWN and UP messages
on the console (mostly while rebuilding kernel + world + ports) but
did not pay much attention to them.

The machine in question is an Athlon XP-64 Socket 939, Asus A8N-VM
CSM. The USB ethernet NIC is a low-budget ADMtek device. My
observations are probably not related to your issues, but they may be
a sign that this is not really a driver issue and not GigE related.
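
To compare machines, it could help to count how often each interface
logs a watchdog timeout or a link DOWN/UP event. Below is a rough
sketch of that, not a tested tool. The message patterns are my
assumption of what the kernel log lines look like (for example
"em0: watchdog timeout" or "aue0: link state changed to DOWN"); the
exact wording may differ by driver and release, and the function name
count_nic_events is made up for this example.

```python
import re
from collections import Counter

# ASSUMPTION: these patterns approximate the kernel log wording.
# Adjust them to match what your system actually prints.
EVENT_PATTERNS = {
    "watchdog": re.compile(r"\b([a-z]+\d+): watchdog timeout"),
    "link_down": re.compile(r"\b([a-z]+\d+): link state changed to DOWN"),
    "link_up": re.compile(r"\b([a-z]+\d+): link state changed to UP"),
}


def count_nic_events(lines):
    """Count events per (interface, event) pair in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for event, pattern in EVENT_PATTERNS.items():
            match = pattern.search(line)
            if match:
                # group(1) is the interface name, e.g. "em0" or "aue0"
                counts[(match.group(1), event)] += 1
    return counts
```

It can be fed an open log file, e.g.
count_nic_events(open("/var/log/messages", errors="replace")), and the
resulting counts compared between an idle period and a buildworld run.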

Greetings,

Volker

