net80211 race conditions seen in -HEAD

PseudoCylon moonlightakkiy at yahoo.ca
Wed Jan 25 14:56:03 UTC 2012


> Date: Sat, 21 Jan 2012 21:40:12 -0800
> From: Adrian Chadd <adrian at freebsd.org>
> Subject: net80211 race conditions seen in -HEAD
> To: freebsd-wireless at freebsd.org
> Message-ID:
>        <CAJ-Vmo=0Q7++oJtZ0jTPUD8q=FwkgJs5EoEXs+jt-XmMADEjtA at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi,
>
> I've noticed some kernel panics in net80211/ath in -HEAD. In all
> instances it boils down to a now-invalid ieee80211_node - either it's
> partially allocated/copied, or it's been recently freed.
>
> This became increasingly obvious when doing DFS CAC, as the kernel was now
> changing the channel quite frequently on me whilst simulating/processing
> radar events. I've since found I can mostly reproduce it in the lab (when
> surrounded by ridiculous levels of RX interference traffic, triggering all
> kinds of events) whilst creating/destroying VAPs.
>
> Now that I have debugging code in place (which as a side effect makes it
> very difficult now to cause a crash, let alone tickle the race condition)
> it's glaringly obvious what's going on.
>
> There are five contexts in which stuff can occur, at least in the net80211/ath case:
>
> * the swi (ie ath_intr(), ath_beacon_proc());
> * the ath taskqueue;
> * the net80211 taskqueue;
> * the ioctl() context, coming up from a userland process;
> * a callout running in the clock thread.
>
> Now, callouts should _hopefully_ be grabbing and releasing locks correctly.
> We've found a few spots where they weren't (leading to quite silly state
> races and crashes.)
>
> I'm going to ignore the obvious possible problems with multiple concurrent
> processes doing ioctl()s. I'm simply going to operate on the principle that
> the multiple-ioctl() path is fine.
>
> It seems that -obtaining- a reference to vap->iv_bss isn't locked. So in
> (say) ieee80211_sta_join1() the iv_bss node can be dereferenced and freed.
> If this is going on concurrently with (say) something going on in the
> net80211 taskqueue (eg a newstate call) then I _think_ it's possible for
> the ath_newstate() code to get a reference to vap->iv_bss simultaneously
> with it being freed in ieee80211_sta_join1() (or similar.) So the
> ath_newstate() code will be assigned a 'ni' that has just been freed.
>
> I've seen another crash in the net80211_ht code where it _looks_ like the
> bss node wasn't entirely set up - bsschan was 0xffff - so the kernel panicked
> hard there.
>
> This likely explains a lot of the "weird stuff" people have been reporting.
> I also think the bgscan race is related to this - I can't help but wonder
> if the bgscan callout/event is also coinciding with wpa_supplicant doing
> stuff, and a race condition ends up leaving the vap w/ the sta power save
> flag set.
>
> I don't yet have a solution to all of this - I just wanted to brain dump
> what I've seen thus far.
>

Here is my brain dump.

A while ago the USB wifi drivers had a similar issue (a race in the
net80211 stack). It's worth checking this rev.
http://svnweb.freebsd.org/base?view=revision&revision=212127
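
To make the iv_bss race described above concrete, here is a minimal
userland sketch of the general pattern: read the pointer and take the
reference inside one critical section, so a concurrent swap cannot free
the node in between. This is not net80211 code and not a description of
what the revision above changed. The struct and function names (node,
vap, vap_get_bss, vap_set_bss, node_release, demo_swap_while_held) are
hypothetical stand-ins, and a pthread mutex stands in for whatever lock
the kernel code would actually use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; NOT the real net80211 structures. */
struct node {
	int refcnt;		/* protected by vap->lock in this sketch */
	int chan;
};

struct vap {
	pthread_mutex_t lock;
	struct node *bss;	/* plays the role of vap->iv_bss */
};

static struct node *node_alloc(int chan)
{
	struct node *ni = calloc(1, sizeof(*ni));

	if (ni != NULL) {
		ni->refcnt = 1;	/* reference owned by the creator */
		ni->chan = chan;
	}
	return ni;
}

/* Drop one reference; free on the last one. Returns 1 if freed. */
static int node_release(struct vap *v, struct node *ni)
{
	int last;

	pthread_mutex_lock(&v->lock);
	assert(ni->refcnt > 0);
	last = (--ni->refcnt == 0);
	pthread_mutex_unlock(&v->lock);
	if (last)
		free(ni);
	return last;
}

/* Read the pointer AND take the reference in one critical section. */
static struct node *vap_get_bss(struct vap *v)
{
	struct node *ni;

	pthread_mutex_lock(&v->lock);
	ni = v->bss;
	if (ni != NULL)
		ni->refcnt++;
	pthread_mutex_unlock(&v->lock);
	return ni;
}

/* Install a new bss node (the vap takes over the caller's reference)
 * and drop the vap's reference on the old one. Returns 1 if the old
 * node was freed by this call. */
static int vap_set_bss(struct vap *v, struct node *nw)
{
	struct node *old;

	pthread_mutex_lock(&v->lock);
	old = v->bss;
	v->bss = nw;
	pthread_mutex_unlock(&v->lock);
	return old != NULL ? node_release(v, old) : 0;
}

/* Returns 0 if a held reference keeps the old node valid across a swap. */
static int demo_swap_while_held(void)
{
	struct vap v;
	struct node *first = node_alloc(1), *second = node_alloc(6);
	struct node *held;
	int ok;

	if (first == NULL || second == NULL) {
		free(first);
		free(second);
		return 1;
	}
	pthread_mutex_init(&v.lock, NULL);
	v.bss = first;
	held = vap_get_bss(&v);			/* a reader holds a ref */
	ok = (vap_set_bss(&v, second) == 0);	/* swap must not free it */
	ok = ok && held->chan == 1;		/* old node still valid */
	ok = ok && node_release(&v, held) == 1;	/* freed on last release */
	ok = ok && v.bss->chan == 6;
	node_release(&v, v.bss);
	pthread_mutex_destroy(&v.lock);
	return ok ? 0 : 1;
}
```

If vap_get_bss() read v->bss without the lock and bumped the count
afterwards, vap_set_bss() could free the node between those two steps,
which is the "assigned a 'ni' that has just been freed" case from the
quoted mail.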

AK

