Re: git: 2e72208b6c62 - main - ip_mroute: do not call epoch_waitwhen lock is taken

From: Wojciech Macek <wma_at_semihalf.com>
Date: Thu, 13 Jan 2022 15:27:23 UTC
Thanks, i'll take a look if moving mrouter_done outside the lock would work.

Wojtek

śr., 12 sty 2022, 22:40 użytkownik Mark Johnston <markj@freebsd.org>
napisał:

> On Tue, Jan 11, 2022 at 10:19:45AM +0000, Wojciech Macek wrote:
> > The branch main has been updated by wma:
> >
> > URL:
> https://cgit.FreeBSD.org/src/commit/?id=2e72208b6c622505323ed48dc58830fc307392b1
> >
> > commit 2e72208b6c622505323ed48dc58830fc307392b1
> > Author:     Wojciech Macek <wma@FreeBSD.org>
> > AuthorDate: 2022-01-11 10:08:35 +0000
> > Commit:     Wojciech Macek <wma@FreeBSD.org>
> > CommitDate: 2022-01-11 10:19:32 +0000
> >
> >     ip_mroute: do not call epoch_waitwhen lock is taken
> >
> >     mrouter_done is called with RAW IP lock taken. Some annoying
> >     printfs are visible on the console if INVARIANTS option is enabled.
> >
> >     Provide atomic-based mechanism which counts enters and exits from/to
> >     critical section in ip_input and ip_output.
> >     Before de-initialization of function pointers ensure (with busy-wait)
> >     that mrouter de-initialization is visible to all readers and that we
> don't
> >     remove pointers (like ip_mforward etc.) in the middle of packet
> processing.
>
> This doesn't address the problem that the warning is complaining about,
> which is that a non-sleepable lock is held while performing an expensive
> operation.  Now we are silently spinning instead, and there's no
> guarantee of forward progress since threads may be frequently entering
> the MROUTER_RLOCK-protected section in ip_input().  And the change
> converts a per-CPU atomic operation (in epoch_enter()) to a global
> atomic operation, potentially hurting performance.  The atomics are also
> missing requisite barriers.
>
> I think a better solution is to not hold the raw inpcb lock while
> calling the ip_mrouter_done() callback.  That lock does not provide
> synchronization for anything related to ip_mroute.c.  Also note that
> X_ip_mrouter_done() can sleep in taskqueue_drain(), which is another
> reason to avoid holding the lock there.
>
> > ---
> >  sys/netinet/ip_mroute.h | 9 +++++----
> >  sys/netinet/raw_ip.c    | 1 +
> >  2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/sys/netinet/ip_mroute.h b/sys/netinet/ip_mroute.h
> > index 65c5bdd3a025..faf5b8c72a46 100644
> > --- a/sys/netinet/ip_mroute.h
> > +++ b/sys/netinet/ip_mroute.h
> > @@ -365,11 +365,12 @@ extern int      (*ip_mrouter_set)(struct socket *,
> struct sockopt *);
> >  extern int   (*ip_mrouter_get)(struct socket *, struct sockopt *);
> >  extern int   (*ip_mrouter_done)(void);
> >  extern int   (*mrt_ioctl)(u_long, caddr_t, int);
> > +extern int   ip_mrouter_critical_section_cnt;
> >
> > -#define      MROUTER_RLOCK_TRACKER   struct epoch_tracker mrouter_et
> > -#define      MROUTER_RLOCK() epoch_enter_preempt(net_epoch_preempt,
> &mrouter_et)
> > -#define      MROUTER_RUNLOCK()
>  epoch_exit_preempt(net_epoch_preempt, &mrouter_et)
> > -#define      MROUTER_WAIT()  epoch_wait_preempt(net_epoch_preempt)
> > +#define      MROUTER_RLOCK_TRACKER
>
> Why keep this definition at all?
>
> > +#define      MROUTER_RLOCK()
>  atomic_add_int(&ip_mrouter_critical_section_cnt, 1)
> > +#define      MROUTER_RUNLOCK()
>  atomic_subtract_int(&ip_mrouter_critical_section_cnt, 1)
> > +#define      MROUTER_WAIT()          do {} while
> (atomic_load_int(&ip_mrouter_critical_section_cnt) != 0)
> >
> >  #endif /* _KERNEL */
> >
> > diff --git a/sys/netinet/raw_ip.c b/sys/netinet/raw_ip.c
> > index 7c495745806e..dc49c36f25ad 100644
> > --- a/sys/netinet/raw_ip.c
> > +++ b/sys/netinet/raw_ip.c
> > @@ -120,6 +120,7 @@ VNET_DEFINE(struct socket *, ip_mrouter);
> >  int (*ip_mrouter_set)(struct socket *, struct sockopt *);
> >  int (*ip_mrouter_get)(struct socket *, struct sockopt *);
> >  int (*ip_mrouter_done)(void);
> > +int ip_mrouter_critical_section_cnt;
> >  int (*ip_mforward)(struct ip *, struct ifnet *, struct mbuf *,
> >                  struct ip_moptions *);
> >  int (*mrt_ioctl)(u_long, caddr_t, int);
>