VIMAGE + kldload wlan + kldload wtap panic

Marko Zec zec at fer.hr
Tue Mar 6 20:51:34 UTC 2012


On Tuesday 06 March 2012 21:29:32 Monthadar Al Jaberi wrote:
> On Tue, Mar 6, 2012 at 9:22 PM, Marko Zec <zec at fer.hr> wrote:
> > On Tuesday 06 March 2012 21:13:00 Monthadar Al Jaberi wrote:
> >> I am confused, so what's the difference between having wlan in kernel
> >> config or not? Cause that seems the reason why we panic... linker
> >> problems?
> >
> > It's not impossible.
> >
> > Have you tried to do CURVNET_SET(ss->ss_vap->iv_ifp->if_vnet) on entry to
> > scan_task() as I suggested earlier in this thread?
>
> this is the code I added:
> diff --git a/sys/net80211/ieee80211_scan.c b/sys/net80211/ieee80211_scan.c
> index 5c1e3d9..bd20653 100644
> --- a/sys/net80211/ieee80211_scan.c
> +++ b/sys/net80211/ieee80211_scan.c
> @@ -850,6 +850,7 @@ scan_task(void *arg, int pending)
>         int scandone = 0;
>
>         IEEE80211_LOCK(ic);
> +      CURVNET_SET((struct ieee80211_scan_state *)
> ss->ss_vap->iv_ifp->if_curvnet);

The explicit cast is redundant here; I only used it earlier to clarify the
type of variable ss...

Anyhow, looking again at the backtrace you posted at the beginning of this
thread, it's apparent that the code calls into rt_dispatch() and fails at
line 1494:

#ifdef VIMAGE
        if (V_loif)
                m->m_pkthdr.rcvif = V_loif;
	...

because curvnet is not set, or is pointing somewhere where it shouldn't.

Can you post the output of show pcpu and show vnets from the ddb> prompt when
you get the panic?

Have you actually booted a freshly built kernel?  I'm pretty sure it's
impossible for CURVNET_SET() to succeed with a NULL or an incorrect vnet
pointer as an argument when the kernel is built with VNET_DEBUG, so I don't
see how it may be possible for a patched scan_task() to call into
rt_dispatch() with a wrong curvnet.
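
To illustrate the reasoning above, here is a minimal user-space sketch of the
set/restore pairing. It is not the kernel code: the struct layouts, the
simplified macros, and the helper names uses_curvnet() and task() are
assumptions made for illustration only, and the assert() merely stands in for
the kind of argument check a VNET_DEBUG kernel performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct vnet { int id; };
struct ifnet { struct vnet *if_vnet; };

/* Models the per-thread curvnet pointer; NULL means "no context set". */
struct vnet *curvnet = NULL;

/* Simplified model of the macros: save the old value, validate, switch. */
#define CURVNET_SET(arg)                                \
        struct vnet *saved_vnet = curvnet;              \
        assert((arg) != NULL);                          \
        curvnet = (arg)
#define CURVNET_RESTORE()                               \
        curvnet = saved_vnet

/* Models code such as rt_dispatch() that depends on curvnet being valid. */
int
uses_curvnet(void)
{
        return (curvnet != NULL ? curvnet->id : -1);
}

/* Models a task handler that is entered without any vnet context. */
int
task(struct ifnet *ifp, int cancelled)
{
        int seen;

        CURVNET_SET(ifp->if_vnet);
        if (cancelled) {
                CURVNET_RESTORE();      /* early return path */
                return (-2);
        }
        seen = uses_curvnet();          /* runs with curvnet set */
        CURVNET_RESTORE();              /* normal return path */
        return (seen);
}
```

In this model, calling uses_curvnet() directly (with no context) yields -1,
while the same call made from inside task() sees the vnet taken from the
interface. Every return path restores the saved value, which is the same
pairing the diff quoted below adds to scan_task().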

Cheers,

Marko

>         if (vap == NULL || (ic->ic_flags & IEEE80211_F_SCAN) == 0 ||
>             (SCAN_PRIVATE(ss)->ss_iflags & ISCAN_ABORT)) {
>                 /* Cancelled before we started */
> @@ -1004,6 +1005,7 @@ scan_task(void *arg, int pending)
>                 ss->ss_ops->scan_restart(ss, vap);      /* XXX? */
>                 ieee80211_runtask(ic, &SCAN_PRIVATE(ss)->ss_scan_task);
>                 IEEE80211_UNLOCK(ic);
> +               CURVNET_RESTORE();
>                 return;
>         }
>
> @@ -1043,6 +1045,7 @@ done:
>         SCAN_PRIVATE(ss)->ss_iflags &= ~(ISCAN_CANCEL|ISCAN_ABORT);
>         ss->ss_flags &= ~(IEEE80211_SCAN_ONCE | IEEE80211_SCAN_PICK1ST);
>         IEEE80211_UNLOCK(ic);
> +       CURVNET_RESTORE();
>  #undef ISCAN_REP
>  }
>
> same panic...
>
> > Cheers,
> >
> > Marko
> >
> >> On Tue, Mar 6, 2012 at 9:06 PM, Adrian Chadd <adrian.chadd at gmail.com> wrote:
> >> > Hi,
> >> >
> >> > The trouble here is that net80211 has quite a few other contexts that
> >> > things are called from:
> >> >
> >> > * driver taskqueue;
> >> > * net80211 taskqueue;
> >> > * driver callouts;
> >> > * net80211 callouts;
> >> > * ioctls via net80211.
> >> >
> >> > That's in parallel with frame tx/rx and device ioctls.
> >> >
> >> > I don't personally have the time to go through net80211 and driver(s)
> >> > at the moment to figure out what's going on. Since ath(4) does a bunch
> >> > of frame processing in taskqueue context (and I'm trying to eliminate
> >> > frame processing in _callout_ context, ew..) things can potentially
> >> > get a bit hairy.
> >> >
> >> >
> >> > Adrian
> >> >
> >> > On 6 March 2012 11:59, Marko Zec <zec at fer.hr> wrote:
> >> >> On Tuesday 06 March 2012 20:49:38 Monthadar Al Jaberi wrote:
> >> >>> I added VNET_DEBUG and noticed this warning (original scan_task
> >> >>> code):
> >> >>>
> >> >>> CURVNET_SET() recursion in sosend() line 1350, prev in
> >> >>> kern_kldload() 0xfffffe0002202c40 -> 0xfffffe0002202c40
> >> >>> KDB: stack backtrace:
> >> >>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> >> >>> kdb_backtrace() at kdb_backtrace+0x37
> >> >>> sosend() at sosend+0xbd
> >> >>> clnt_vc_call() at clnt_vc_call+0x3e6
> >> >>> clnt_reconnect_call() at clnt_reconnect_call+0xf5
> >> >>> newnfs_request() at newnfs_request+0x9fb
> >> >>> nfscl_request() at nfscl_request+0x72
> >> >>> nfsrpc_lookup() at nfsrpc_lookup+0x1be
> >> >>> nfs_lookup() at nfs_lookup+0x297
> >> >>> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x95
> >> >>> lookup() at lookup+0x3b8
> >> >>> namei() at namei+0x484
> >> >>> vn_open_cred() at vn_open_cred+0x1e2
> >> >>> link_elf_load_file() at link_elf_load_file+0xb3
> >> >>> linker_load_module() at linker_load_module+0x794
> >> >>> kern_kldload() at kern_kldload+0x145
> >> >>> sys_kldload() at sys_kldload+0x84
> >> >>> amd64_syscall() at amd64_syscall+0x39e
> >> >>> Xfast_syscall() at Xfast_syscall+0xf7
> >> >>
> >> >> You can safely ignore those.  Recursing on curvnet is harmless, and
> >> >> in certain cases it can't be avoided.
> >> >>
> >> >> When injecting new CURVNET_SET() / CURVNET_RESTORE() points in the
> >> >> existing code, those warnings are here to help us become aware that
> >> >> we are setting curvnet in a function which was invoked with an
> >> >> already valid curvnet context.
> >> >>
> >> >> Marko
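
The recursion case described in the quoted text above can be sketched in user
space as follows. This is only a model under stated assumptions: the macros
are simplified, the recursions counter stands in for the printed warning, and
inner() and outer() are hypothetical functions, not anything in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel type; not the real definition. */
struct vnet { int id; };

struct vnet *curvnet = NULL;    /* models the per-thread pointer */
int recursions = 0;             /* stands in for the printed warning */

/* Simplified model: note a recursion when a context is already set. */
#define CURVNET_SET(arg)                                \
        struct vnet *saved_vnet = curvnet;              \
        assert((arg) != NULL);                          \
        if (saved_vnet != NULL)                         \
                recursions++;                           \
        curvnet = (arg)
#define CURVNET_RESTORE()                               \
        curvnet = saved_vnet

/* Sets curvnet itself, so it also works when called without a context. */
int
inner(struct vnet *vnet)
{
        int id;

        CURVNET_SET(vnet);
        id = curvnet->id;
        CURVNET_RESTORE();
        return (id);
}

/* Already holds a context when it calls inner(), so inner() recurses. */
int
outer(struct vnet *vnet)
{
        int id;

        CURVNET_SET(vnet);
        id = inner(vnet);
        CURVNET_RESTORE();
        return (id);
}
```

Calling outer() makes inner() set curvnet while a context is already valid,
which the model counts once; the result is still correct and curvnet is
restored to its previous value at each level. Calling inner() on its own does
not count as a recursion.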




More information about the freebsd-virtualization mailing list