RFT: if_ath HAL refactoring
PseudoCylon
moonlightakkiy at yahoo.ca
Thu Sep 23 00:35:39 UTC 2010
----- Original Message ----
> From: Rui Paulo <rpaulo at FreeBSD.org>
> To: PseudoCylon <moonlightakkiy at yahoo.ca>
> Cc: Bernhard Schmidt <bschmidt at techwires.net>; freebsd-current at freebsd.org;
>     Adrian Chadd <adrian at freebsd.org>
> Sent: Wed, September 22, 2010 4:48:14 PM
> Subject: Re: RFT: if_ath HAL refactoring
>
> On 22 Sep 2010, at 23:42, PseudoCylon wrote:
>
> >
> >
> >
> >
> > ----- Original Message ----
> >> From: Bernhard Schmidt <bschmidt at techwires.net>
> >> To: freebsd-current at freebsd.org
> >> Cc: PseudoCylon <moonlightakkiy at yahoo.ca>; Adrian Chadd
> >>     <adrian at freebsd.org>
> >> Sent: Wed, September 22, 2010 12:09:36 AM
> >> Subject: Re: RFT: if_ath HAL refactoring
> >>
> >> On Wednesday, September 22, 2010 06:04:49 PseudoCylon wrote:
> >>> ----- Original Message ----
> >>>
> >>>> From: Adrian Chadd <adrian at freebsd.org>
> >>>> To: PseudoCylon <moonlightakkiy at yahoo.ca>
> >>>> Cc: freebsd-current at freebsd.org
> >>>> Sent: Tue, September 21, 2010 7:04:37 AM
> >>>> Subject: Re: RFT: if_ath HAL refactoring
> >>>>
> >>>> On 21 September 2010 11:58, PseudoCylon <moonlightakkiy at yahoo.ca>
> >>>> wrote:
> >>>>> Just in case anyone wonders, I've added 11n support to run(4) (USB
> >>>>> NIC). http://gitorious.org/run/run/trees/11n_beta2
> >>>>>
> >>>>> It still has some issues,
> >>>>>
> >>>>> * doesn't work well with atheros chips
> >>>>>
> >>>>> * HT + AP + bridge = Tx may stall (seems OK with nat)
> >>>>>
> >>>>> So, use it at your own discretion.
> >>>>
> >>>> Want to put together a patch?
> >>>
> >>> sure!
> >>>
> >>>> Does it introduce issues in the non-11n case?
> >>>
> >>> No, only in 11n mode.
> >>>
> >>> What I have found so far is that Ralink's driver checks the MAC address
> >>> of the other end and identifies Atheros chips by OUI. Then it sets a
> >>> special prot mode for them. Does this ring a bell?
> >>
> >> Are you sure that this is based on the actual MAC addresses? Atheros drivers
> >> tend to announce additional capabilities in beacons and probe responses.
> >
> > It is based on the actual MAC, but it is Broadcom's OUI (00904c). Sorry.
> >
> >>
> >>> Has the node lock in ieee80211_node_timeout() caused a deadlock in HT + AP +
> >>> bridge?
> >>
> >> I'm not aware of any issues there, though I'm not very familiar with HT use
> >> cases.
> >
> > I attached witness messages. Those 2 LORs always happen together before
> > the deadlock. I hooked iv_input() and unlock/lock the node lock there to
> > avoid the deadlock. (I don't know if it's safe.)
> >
> > I wonder if this is run(4) specific problem.
> >
> >
> > AK
> >
> >
> > lock order reversal:
> > 1st 0xffffff8000a267d0 run0_node_lock (run0_node_lock) @
> > /usr/src/sys/net80211/ieee80211_node.c:1360
> > 2nd 0xffffff0001716818 if_bridge (if_bridge) @
> > /usr/src/sys/net/if_bridge.c:2184
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > _witness_debugger() at _witness_debugger+0x2e
> > witness_checkorder() at witness_checkorder+0x81e
> > _mtx_lock_flags() at _mtx_lock_flags+0x78
> > bridge_input() at bridge_input+0x7e
> > ether_input() at ether_input+0x143
> > hostap_input() at hostap_input+0x4ea
> > ampdu_rx_flush() at ampdu_rx_flush+0x5e
> > ieee80211_ht_node_age() at ieee80211_ht_node_age+0x7b
> > ieee80211_node_timeout() at ieee80211_node_timeout+0x2dc
> > softclock() at softclock+0x2a0
> > intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> > ithread_loop() at ithread_loop+0xb2
> > fork_exit() at fork_exit+0x12a
> > fork_trampoline() at fork_trampoline+0xe
> > --- trap 0, rip = 0, rsp = 0xffffff8000052d30, rbp = 0 ---
> >
> > lock order reversal:
> > 1st 0xffffff8000a267d0 run0_node_lock (run0_node_lock) @
> > /usr/src/sys/net80211/ieee80211_node.c:1360
> > 2nd 0xffffffff80a186c8 tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:498
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > _witness_debugger() at _witness_debugger+0x2e
> > witness_checkorder() at witness_checkorder+0x81e
> > _rw_rlock() at _rw_rlock+0x5f
> > tcp_input() at tcp_input+0xa58
> > ip_input() at ip_input+0xbc
> > netisr_dispatch_src() at netisr_dispatch_src+0xb8
> > ether_demux() at ether_demux+0x17d
> > ether_input() at ether_input+0x175
> > hostap_input() at hostap_input+0x4ea
> > ampdu_rx_flush() at ampdu_rx_flush+0x5e
> > ieee80211_ht_node_age() at ieee80211_ht_node_age+0x7b
> > ieee80211_node_timeout() at ieee80211_node_timeout+0x2dc
> > softclock() at softclock+0x2a0
> > intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> > ithread_loop() at ithread_loop+0xb2
> > fork_exit() at fork_exit+0x12a
> > fork_trampoline() at fork_trampoline+0xe
> > --- trap 0, rip = 0, rsp = 0xffffff8000052d30, rbp = 0 ---
>
> Can you explain why the run0_node_lock is locked? I don't have the code at
> hand.
>
> Regards,
> --
> Rui Paulo
>
>
I don't know why, but I know where.

run0_node_lock is locked at ieee80211_node.c:1917
    ieee80211_node_timeout() -> ieee80211_timeout_stations()
http://fxr.watson.org/fxr/source/net80211/ieee80211_node.c?im=bigexcerpts#L1917

ieee80211_node.c:1360 (the one witness reports)
    hostap_input() -> hostap_deliver_data() -> ieee80211_find_vap_node() -> lock
    @ ieee80211_node.c:1360 (I think it's recursed.)

and

run(4) calls ieee80211_iterate_nodes() once/sec for ratectl. (locks @
ieee80211_node.c:2138)

Each one has its own reason to lock, I guess.
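
To make the LORs above concrete, here is a rough userland sketch (not kernel
code) of how a witness-style check ends up flagging them. It assumes some other
path has already taken if_bridge (or tcp) before the node lock, which is what a
reversal report implies. The lock ids and the tracker_* names are made up for
illustration and are not WITNESS's API. Real WITNESS keeps a richer order graph;
this only remembers direct pairs and models one thread at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock ids for this sketch; not kernel identifiers. */
enum { NODE_LOCK, BRIDGE_LOCK, TCP_LOCK, NLOCKS };

struct order_tracker {
	/* seen_before[a][b]: lock a was held at some point when b was taken. */
	bool seen_before[NLOCKS][NLOCKS];
	/* Locks held by the single thread being modelled. */
	bool held[NLOCKS];
};

static void
tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/*
 * Take 'lock' and return how many order reversals that causes, i.e. how
 * many currently held locks were previously seen taken *after* 'lock'.
 */
static int
tracker_acquire(struct order_tracker *t, int lock)
{
	int reversals = 0;

	assert(lock >= 0 && lock < NLOCKS);
	assert(!t->held[lock]);		/* the sketch does not model recursion */

	for (int h = 0; h < NLOCKS; h++) {
		if (!t->held[h])
			continue;
		if (t->seen_before[lock][h])
			reversals++;	/* order lock -> h was recorded earlier */
		t->seen_before[h][lock] = true;
	}
	t->held[lock] = true;
	return (reversals);
}

static void
tracker_release(struct order_tracker *t, int lock)
{
	assert(lock >= 0 && lock < NLOCKS);
	assert(t->held[lock]);
	t->held[lock] = false;
}
```

Taking BRIDGE_LOCK then NODE_LOCK, releasing both, and later taking NODE_LOCK
then BRIDGE_LOCK makes the last tracker_acquire() return 1. That second
sequence is the shape of the timeout path in the traces: node lock first, then
bridge_input() wants if_bridge.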

My workaround:
http://gitorious.org/run/run/blobs/11n_beta2/dev/usb/wlan/if_run.c :1865
It unlocks the lock taken in ieee80211_timeout_stations(). This one is held for
a long time.

Hope this is what you wanted to know.
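
For anyone without the tree at hand, the general shape of such a hook (drop the
lock around the input upcall, retake it afterwards) can be sketched in plain
userland C. This is a toy model and not the code in if_run.c: the toy_* types
and functions are hypothetical, the lock is just a flag, and the input handler
signature is simplified and is not net80211's iv_input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are net80211 types. */
struct toy_lock {
	bool held;	/* says whether the lock is held, not by whom */
};

static void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct toy_vap {
	struct toy_lock node_lock;
	int (*orig_input)(struct toy_vap *, int);	/* saved by the hook */
	int (*input)(struct toy_vap *, int);		/* current handler */
	bool lock_held_during_input;			/* set by toy_orig_input */
};

/* Stands in for the delivery path that goes on to take other locks. */
static int
toy_orig_input(struct toy_vap *vap, int frame)
{
	vap->lock_held_during_input = vap->node_lock.held;
	return (frame);
}

/* The hook: drop the node lock if held, call the original, retake it. */
static int
toy_hooked_input(struct toy_vap *vap, int frame)
{
	bool was_held = vap->node_lock.held;
	int ret;

	if (was_held)
		toy_lock_release(&vap->node_lock);
	ret = vap->orig_input(vap, frame);
	if (was_held)
		toy_lock_acquire(&vap->node_lock);
	return (ret);
}

static void
toy_install_hook(struct toy_vap *vap)
{
	vap->orig_input = vap->input;
	vap->input = toy_hooked_input;
}

/* Models the timeout path: node lock taken first, then a frame delivered. */
static int
toy_timeout_path(struct toy_vap *vap, int frame)
{
	int ret;

	toy_lock_acquire(&vap->node_lock);
	ret = vap->input(vap, frame);
	toy_lock_release(&vap->node_lock);
	return (ret);
}
```

Before toy_install_hook() the delivery path runs with the node lock held; after
it, the lock is released for the duration of the upcall. Whatever the lock
protects can change while it is dropped, so the caller's iteration has to
tolerate that.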
AK