kern/139117: [lagg] + wlan boot timing (EBUSY)

David Horn dhorn2000 at gmail.com
Thu Sep 24 16:00:12 UTC 2009


>Number:         139117
>Category:       kern
>Synopsis:       [lagg] + wlan boot timing (EBUSY)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 24 16:00:11 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     David Horn
>Release:        8.0 RC1
>Organization:
>Environment:
FreeBSD lagg 8.0-RC1 FreeBSD 8.0-RC1 #11 r197417: Wed Sep 23 01:05:15
EDT 2009     root at lagg:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
I have been trying to track down a problem where my lagg connection
sometimes does not properly enable wlan as fallback on boot.  It works
properly about 60% of the time.  The rest of the time, it fails with
"SIOCSLAGGPORT: Device busy".

Here are the relevant rc.conf entries:

ifconfig_bfe0="up"
wlans_iwn0="wlan0"
ifconfig_wlan0="WPA"
ifconfig_iwn0="ether 00:1c:23:98:2c:5d"
cloned_interfaces="lagg0"
ipv6_network_interfaces="lagg0"
ifconfig_lagg0="laggproto failover laggport bfe0 laggport wlan0 DHCP"
ipv6_enable="YES"

So, I turned on some logging of all ifconfig commands with timestamps
and stdout/stderr/returncode, and noticed this:

Wed Sep 23 01:39:56 EDT 2009 ifconfig: lagg0 create ;
;; Wed Sep 23 01:39:56 EDT 2009 lagg0 rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: -l ;
iwn0 bfe0 fwe0 fwip0 lo0 lagg0
;; Wed Sep 23 01:39:56 EDT 2009 -l rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: -l ;
iwn0 bfe0 fwe0 fwip0 lo0 lagg0
;; Wed Sep 23 01:39:56 EDT 2009 -l rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: lo0 inet 127.0.0.1 ;
;; Wed Sep 23 01:39:56 EDT 2009 lo0 rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: lo0 up ;
;; Wed Sep 23 01:39:56 EDT 2009 lo0 rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: iwn0 ether 00:1c:23:98:2c:5d ;
;; Wed Sep 23 01:39:56 EDT 2009 iwn0 rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: iwn0 up ;
;; Wed Sep 23 01:39:56 EDT 2009 iwn0 rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: wlan0 create wlandev iwn0 ;
;; Wed Sep 23 01:39:56 EDT 2009 wlan0 rc='0' end.
Wed Sep 23 01:39:56 EDT 2009 ifconfig: wlan0 ;
wlan0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
       ether 00:1c:23:98:2c:5d
       media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
       status: no carrier
       ssid "" channel 1 (2412 Mhz 11b)
       country US authmode OPEN privacy OFF txpower 14 bmiss 10 scanvalid 60
       wme bintval 0
;; Wed Sep 23 01:39:56 EDT 2009 wlan0 rc='0' end.
Wed Sep 23 01:39:57 EDT 2009 ifconfig: lagg0 laggproto failover laggport bfe0 laggport wlan0 ;
ifconfig.real: SIOCSLAGGPORT: Device busy
;; Wed Sep 23 01:39:57 EDT 2009 lagg0 rc='1' end.
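
The wrapper that produced the log above is not included in this report.
As a rough sketch only, a trace in a similar format could be produced
with something like the following.  It assumes the original binary was
renamed to ifconfig.real (as the error prefix in the log suggests); the
names REAL_CMD, LOGFILE and log_run, and both default paths, are
hypothetical and not taken from the report.

```sh
#!/bin/sh
# Hypothetical logging wrapper sketch; REAL_CMD and LOGFILE are assumed names.
: "${REAL_CMD:=/sbin/ifconfig.real}"          # assumed location of the renamed binary
: "${LOGFILE:=/var/log/ifconfig-trace.log}"   # assumed log path

log_run() {
    # Header: timestamp plus the arguments exactly as passed.
    printf '%s ifconfig: %s ;\n' "$(date)" "$*" >> "$LOGFILE"
    # Run the real command; stdout and stderr are captured together.
    out=$("$REAL_CMD" "$@" 2>&1) && rc=0 || rc=$?
    if [ -n "$out" ]; then
        printf '%s\n' "$out" >> "$LOGFILE"
        # Pass output through (note: stderr is merged into stdout here).
        printf '%s\n' "$out"
    fi
    # Trailer: timestamp, first argument, and the return code.
    printf ";; %s %s rc='%s' end.\n" "$(date)" "$1" "$rc" >> "$LOGFILE"
    return "$rc"
}
```

Installed in place of ifconfig, such a script would end by calling
log_run "$@".  Because the sketch merges stderr into stdout, callers
that separate the two streams would see slightly different behavior
than with the unwrapped binary.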

So I started looking at the /sys/net/if_lagg.c source and found the
cases that return EBUSY.

This one

/* New lagg port has to be in an idle state */
       if (ifp->if_drv_flags & IFF_DRV_OACTIVE)
               return (EBUSY);

seems to be the culprit, but unfortunately I'm not familiar enough
with the code to take this much further.  I did build a kernel without
this check, and everything seems to work, but removing the check is
obviously not a real fix.  My guess is that the timing issue is
wpa_supplicant talking to wlan0 (trying to scan/associate/authenticate)
while lagg is trying to add wlan0 to its port list.

I confirmed this behavior as follows:

ifconfig wlan0 destroy
ifconfig lagg0 destroy
ifconfig lagg0 create
ifconfig wlan0 create wlandev iwn0  & ; ifconfig lagg0 laggproto failover laggport bfe0 laggport wlan0
results in:
ifconfig: SIOCSLAGGPORT: Device busy

Does someone more clueful than me know of a correct way to fix this
contention issue?
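
Until the underlying contention is understood, one possible stop-gap
(not a fix) would be to retry the laggport step after a short delay.
The following is a minimal sketch under that assumption; retry_cmd is a
hypothetical helper, and it retries on any failure, not only on EBUSY.

```sh
#!/bin/sh
# Stop-gap sketch only: retry a command that may fail transiently.
# Usage: retry_cmd TRIES DELAY_SECONDS CMD [ARGS...]
retry_cmd() {
    tries=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$tries" ]; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$i" -lt "$tries" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    # Every attempt failed.
    return 1
}
```

For example, "retry_cmd 5 1 ifconfig lagg0 laggport wlan0" would try to
add wlan0 up to five times, one second apart.  Whether the port leaves
the busy state within that window is untested here.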


It is possible that the following reports are related:

http://lists.freebsd.org/pipermail/freebsd-current/2009-June/008776.html
http://lists.freebsd.org/pipermail/freebsd-current/2009-June/007641.html
>How-To-Repeat:
After further experimentation, this issue is 100% reproducible at boot time (at least for me) when the lagg group contains more than one wlan interface, e.g.:

ifconfig_bfe0="up"
wlans_iwn0="wlan0"
wlans_ural0="wlan1"
wlans_ath0="wlan2"
ifconfig_wlan0="WPA"
ifconfig_wlan1="WPA"
ifconfig_wlan2="WPA"
ifconfig_iwn0="ether 00:1c:23:98:2c:5d"
ifconfig_ural0="ether 00:1c:23:98:2c:5d"
ifconfig_ath0="ether 00:1c:23:98:2c:5d"
cloned_interfaces="lagg0"
ipv6_network_interfaces="lagg0"
ifconfig_lagg0="laggproto failover laggport bfe0 laggport wlan0 laggport wlan1 laggport wlan2 DHCP"
ipv6_enable="YES"


>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

