bizarre nfe(4) problem
Pyun YongHyeon
pyunyh at gmail.com
Sat Aug 11 02:26:53 PDT 2007
On Sat, Aug 11, 2007 at 06:22:22PM +0900, To Don Lewis wrote:
> On Fri, Aug 10, 2007 at 10:42:19PM -0700, Don Lewis wrote:
> > I've a rather strange nfe(4) problem that appears to be repeatable. I
> > recently started running -CURRENT on an older socket 754 motherboard with
> > the nForce3 chipset. Initially, I was running an SMP kernel, but I saw
> > sporadic "nfe0: watchdog timeout (missed Tx interrupts) -- recovering"
> > messages, and the system would intermittently lose network connectivity
> > and then recover on its own. The kernel was very
> > similar to GENERIC, with just the addition of "options DEBUG_VFS_LOCKS"
> > and the replacement of atapicd with atapicam.
> >
> > The nfe0 problem totally went away when I removed "options SMP" and
> > "device apic" from the kernel configuration, except under the following
> > very specific circumstances:
> >
> > A vncserver session using the GNOME desktop was started on the
> > system.
> >
> > There was no keyboard or mouse activity on the console for an
> > extended period of time, allowing the GNOME screen saver to kick
> > in and lock the screen.
> >
> > The system would run fine in this state for many hours, and would accept
> > incoming SMTP connections, etc.
> >
> > A remote vncclient made a connection to the vncserver session
> > and the password was entered on the client.
> >
> > At this point the nfe0 interface would appear to go deaf. This might
> > happen before or slightly after the password dialog box appeared for the
> > vnc session. For a short while, the system would be able to transmit
> > TCP packets, ntp queries, etc., but it would not respond to any incoming
> > packets (ping, TCP connection requests, etc.). Eventually, the ARP cache
> > would time out and the only packets being transmitted would be ARP
> > requests and the occasional UDP broadcast from the samba server running
> > on the machine.
> >
> > Pressing any key on the (PS/2) keyboard would instantly bring the
> > network interface back to life. Examination of /var/log/messages showed
> > lots of "nfe0: watchdog timeout" messages for the entire time that nfe0
> > was not listening to the network.
> >
> > I've had this problem happen twice. Both times were after an extended
> > period of console inactivity. An incoming vnc connection is not
> > sufficient to trigger the problem if the console was recently active,
> > and even waiting for the GNOME screensaver to put the monitor in DPMS
> > power save mode before initiating the vnc connection does not appear to
> > be sufficient to trigger the problem.
> >
> > I believe that nfe0 was sharing an interrupt with one of the USB ports
> > when the kernel was compiled with "device apic", but it is not sharing
> > an interrupt without "device apic".
> >
> > Any thoughts on how to debug this problem?
> >
> >
> > # vmstat -i
> > interrupt                total   rate
> > irq0: clk             41903449   1000
> > irq1: atkbd0             39034      0
> > irq3: ohci0                  5      0
> > irq7: ppc0                   2      0
> > irq8: rtc              5362802    127
> > irq9: ohci1 ahc0+      1963559     46
> > irq10: nfe0+            225593      5
>             ^^
> You have nfe0+, which indicates vmstat ran out of room to
> display something. I'm not sure, but is it still sharing an interrupt
> with another device?
It seems the interrupt is shared with atapci1.
>
> > irq11: drm0            2511908     59
> > irq12: psm0             332931      7
> > irq14: ata0                 48      0
> > Total                 52339331   1249
> >
>
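If it helps, lines like these can be picked out mechanically by scanning
the vmstat -i output for device names ending in "+". Below is a minimal
sketch in Python. It assumes the column layout shown above (device names
first, then total and rate as the last two fields), and it takes the
trailing "+" to mean a truncated name list, as described above. The
function names parse_vmstat_i and truncated_rows are made up for
illustration and are not part of any FreeBSD tool.

```python
# Sample text copied from the vmstat -i output quoted in this thread.
SAMPLE = """\
interrupt                total   rate
irq0: clk             41903449   1000
irq1: atkbd0             39034      0
irq3: ohci0                  5      0
irq7: ppc0                   2      0
irq8: rtc              5362802    127
irq9: ohci1 ahc0+      1963559     46
irq10: nfe0+            225593      5
irq11: drm0            2511908     59
irq12: psm0             332931      7
irq14: ata0                 48      0
Total                 52339331   1249
"""


def parse_vmstat_i(text):
    """Return a list of (irq, devices, total, rate) tuples.

    Assumption: each data line starts with "irqN:", and the last two
    whitespace-separated fields are the total and the rate.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the "Total" line, and anything malformed.
        if len(fields) < 4 or not fields[0].startswith("irq"):
            continue
        irq = fields[0].rstrip(":")
        devices = fields[1:-2]
        rows.append((irq, devices, int(fields[-2]), int(fields[-1])))
    return rows


def truncated_rows(rows):
    """Return (irq, devices) for rows whose last device name ends in '+'."""
    return [
        (irq, devices)
        for irq, devices, _total, _rate in rows
        if devices and devices[-1].endswith("+")
    ]


if __name__ == "__main__":
    for irq, devices in truncated_rows(parse_vmstat_i(SAMPLE)):
        print(irq, " ".join(devices))
```

To run it against a live system, feed it the text of "vmstat -i" in
place of SAMPLE. The sketch only flags the lines; it cannot say which
other devices share the interrupt.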
> --
> Regards,
> Pyun YongHyeon
--
Regards,
Pyun YongHyeon