iwn firmware instability with an up-to-date stable kernel

Bernhard Schmidt bschmidt at techwires.net
Sat Apr 24 07:50:33 UTC 2010


On Sat, Apr 24, 2010 at 12:45:14AM -0700, Garrett Cooper wrote:
> On Sat, Apr 24, 2010 at 12:34 AM, Bernhard Schmidt
> <bschmidt at techwires.net> wrote:
> > On Fri, Apr 23, 2010 at 11:27:32PM -0700, Garrett Cooper wrote:
> >> On Fri, Apr 23, 2010 at 10:08 PM, Brandon Gooch
> >> <jamesbrandongooch at gmail.com> wrote:
> >> > On Sat, Apr 24, 2010 at 4:59 AM, Garrett Cooper <yanefbsd at gmail.com> wrote:
> >> >> On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper <yanefbsd at gmail.com> wrote:
> >> >>> On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
> >> >>> <jamesbrandongooch at gmail.com> wrote:
> >> >>>> 2010/4/23 Garrett Cooper <yanefbsd at gmail.com>:
> >> >>>>> 2010/4/23 Garrett Cooper <yanefbsd at gmail.com>:
> >> >>>>>> 2010/4/18 Olivier Cochard-Labbé <olivier at cochard.me>:
> >> >>>>>>> 2010/4/18 Bernhard Schmidt <bschmidt at techwires.net>:
> >> >>>>>>>> Are you able to reproduce this on demand? As in type a few commands and
> >> >>>>>>>> the firmware error occurs?
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>> No, I'm not able to reproduce on demand this problem.
> >> >>>>>>
> >> >>>>>> I'm seeing similar issues on occasion with my Lenovo as well:
> >> >>>>>>
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = "NMI_INTERRUPT_WDG" (0x00000004)
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x0000046C
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x000000D0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x0000000207030000
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x00008370000004C2
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x000006DA000018B8
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
> >> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8
> >> >>>>>>
> >> >>>>>> This may be because the system was under load (I was installing a port
> >> >>>>>> shortly before the connection dropped). I'll try poking at this
> >> >>>>>> further because it's going to be an annoying productivity loss :/.
> >> >>>>>
> >> >>>>>    Sorry... should have included more helpful details.
> >> >>>>> Thanks,
> >> >>>>> -Garrett
> >> >>>>>
> >> >>>>> dmesg:
> >> >>>>>
> >> >>>>> iwn0: <Intel(R) PRO/Wireless 4965BGN> mem 0xdf2fe000-0xdf2fffff irq 17 at device 0.0 on pci3
> >> >>>>> iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
> >> >>>>> iwn0: [ITHREAD]
> >> >>>>> iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
> >> >>>>> iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
> >> >>>>> iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
> >> >>>>>
> >> >>>>> pciconf -lv snippet:
> >> >>>>>
> >> >>>>> iwn0@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086 rev=0x61 hdr=0x00
> >> >>>>>    vendor     = 'Intel Corporation'
> >> >>>>>    device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
> >> >>>>>    class      = network
> >> >>>>> cbb0@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180 rev=0xba hdr=0x02
> >> >>>>>
> >> >>>>> uname -a:
> >> >>>>>
> >> >>>>> $ uname -a
> >> >>>>> FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0 r207006: Wed Apr 21 13:18:44 PDT 2010 root@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386
> >> >>>>
> >> >>>> I'm actually looking at this right now. For me, it's actually
> >> >>>> happening when my machine stays on overnight (or for long periods of
> >> >>>> time, idle).
> >> >>>>
> >> >>>> Also, it seems to be causing the kernel to panic, although I'm now
> >> >>>> wondering if the Machine Check Architecture is somehow catching this
> >> >>>> device error and causing an exception (hw.mca.enabled=1)(?) -- not
> >> >>>> possible, right ???
> >> >>>>
> >> >>>> Whatever the case, I can't seem to get the firmware error to occur
> >> >>>> with iwn(4) debugging or wlandebug options enabled, so who knows
> >> >>>> exactly what leads to this.
> >> >>>>
> >> >>>> I know Bernhard has worked hard on this driver, it's a shame that this
> >> >>>> freaky bug has bit us all now, without leaving many clues :(
> >> >>>>
> >> >>>> I've attached a textdump for posterity if nothing else :)
> >> >>>
> >> >>>    Connectivity appears to be shoddy in my neck of the woods (kind of
> >> >>> ironic... but meh). Just running buildworld, buildkernel, then doing a
> >> >>> tcpdump in parallel causes the pseudo device to go up and down a lot.
> >> >>> I assume this isn't standard behavior?
> >> >>>    Just for reference buildworld was started shortly after 19:39:05,
> >> >>> and it finished at 21:29. The interface has also gone up and down once
> >> >>> since then while the system's been basically idle.
> >> >>
> >> >>    Hmmm... I seem to be in an excellent position to reproduce this
> >> >> issue. I've reproduced it twice by merely bringing the interface up
> >> >> and down several times using:
> >> >>
> >> >> ifconfig_wlan0="WPA DHCP"
> >> >>
> >> >>    instead of my usual:
> >> >>
> >> >> ifconfig_wlan0="WPA ssid <base-station-id1> DHCP"
> >> >>
> >> >>    Maybe others who are experiencing the issue should try that? I'll
> >> >> do more testing when I get home...
> >
> > How did you do that? Reloading the module, or with ifconfig?
> 
> /etc/rc.d/netif restart, which does the ifconfig operations (no
> module change occurred AFAIK, but wlan0 did of course do some
> device_printf's when it was associating itself with iwn(4)).

Can you do ps xa | grep wpa? Just wondering if wpa_supplicant gets
started twice.
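
A minimal sketch of that check, assuming a POSIX sh. The function names
count_wpa and check_wpa are made up for illustration and are not part of
the base system; they only filter `ps xa`-style output on stdin:

```shell
#!/bin/sh
# Hypothetical helpers (not part of FreeBSD): count wpa_supplicant
# processes in `ps xa`-style output read from stdin.

count_wpa() {
    # The [w] bracket keeps this grep's own command line from matching
    # when the input comes from a live `ps xa`.
    # grep -c prints 0 but exits non-zero when nothing matches, so
    # `|| true` keeps the function usable under `set -e`.
    grep -c '[w]pa_supplicant' || true
}

check_wpa() {
    n=$(count_wpa)
    if [ "$n" -gt 1 ]; then
        echo "warning: $n wpa_supplicant processes running"
    else
        echo "ok: $n wpa_supplicant process(es) running"
    fi
}
```

Run it as `ps xa | check_wpa` right after `/etc/rc.d/netif restart`; a
warning would suggest a second supplicant was started for the interface.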

> >> >
> >> > My rc.conf is:
> >> >
> >> > ifconfig_wlan0="WPA DHCP"
> >> >
> >> > ...as well, although I haven't tried manually taking the interface
> >> > down and bringing it back up.
> >>
> >> Hmmm... that is interesting. I wish I could do that, but it seems to
> >> be eluding my grasp right now. The driver just kind of freaks out
> >> with a bunch of SSIDs, one being my target SSID, a bunch of NUL string
> >> ones, and then finally it just croaks. I need to figure out whether or
> >> not the SSIDs are valid when I boot it up at my desk again.
> >>
> >> > Are you waiting for the device to associate and begin passing traffic
> >> > before each up/down cycle?
> >>
> >> I was, but I'm not sure whether or not the Ajax pieces in GMail were.
> >> I'll try some more rudimentary tests when I get back to work on Monday
> >> in that environment, but I need to try out other things at home as
> >> well in the meantime.
> 
> Thanks,
> -Garrett

-- 
Bernhard

