Re: git: 589aed00e36c - main - sched: separate out schedinit_ap()

From: Mark Johnston <markj_at_freebsd.org>
Date: Wed, 17 Nov 2021 23:25:55 UTC
On Thu, Nov 18, 2021 at 01:10:17AM +0200, Konstantin Belousov wrote:
> On Wed, Nov 17, 2021 at 04:44:29PM -0600, Kyle Evans wrote:
> > On Wed, Nov 3, 2021 at 3:55 PM Kyle Evans <kevans@freebsd.org> wrote:
> > >
> > > The branch main has been updated by kevans:
> > >
> > > URL: https://cgit.FreeBSD.org/src/commit/?id=589aed00e36c22733d3fd9c9016deccf074830b1
> > >
> > > commit 589aed00e36c22733d3fd9c9016deccf074830b1
> > > Author:     Kyle Evans <kevans@FreeBSD.org>
> > > AuthorDate: 2021-11-02 18:06:47 +0000
> > > Commit:     Kyle Evans <kevans@FreeBSD.org>
> > > CommitDate: 2021-11-03 20:54:59 +0000
> > >
> > >     sched: separate out schedinit_ap()
> > >
> > >     schedinit_ap() sets up an AP for a later call to sched_throw(NULL).
> > >
> > >     Currently, ULE sets up some pcpu bits and fixes the idlethread lock with
> > >     a call to sched_throw(NULL); this results in a window where curthread is
> > >     setup in platforms' init_secondary(), but it has the wrong td_lock.
> > >     Typical platform AP startup procedure looks something like:
> > >
> > >     - Setup curthread
> > >     - ... other stuff, including cpu_initclocks_ap()
> > >     - Signal smp_started
> > >     - sched_throw(NULL) to enter the scheduler
> > >
> > >     cpu_initclocks_ap() may have callouts to process (e.g., nvme) and
> > >     attempt to sched_add() for this AP, but this attempt fails because
> > >     of the noted violated assumption leading to locking heartburn in
> > >     sched_setpreempt().
> > >
> > >     Interrupts are still disabled until cpu_throw() so we're not really at
> > >     risk of being preempted -- just let the scheduler in on it a little
> > >     earlier as part of setting up curthread.
> > >
> > 
> > What's the general consensus on potential out-of-tree archs maintained
> > on stable branches? I'd like to MFC this at least to stable/13 to
> > avoid it being in the way of the nvme change that spurred it, and I'm
> > trying to decide if it should have something like this added to make
> > it safe:
> I do not believe that we even think of guaranteeing this level of source
> stability.

At first I assumed this was referencing sparc64, but that is not present
in stable/13 either.  I believe stable/13 and main support the same set
of platforms, in which case I agree that we shouldn't bother with trying
to provide extra compatibility, and I think it's probably not necessary
to merge this to 12.

> > 
> > diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c
> > index 217d685b8587..f07f5e91d8f3 100644
> > --- a/sys/kern/sched_ule.c
> > +++ b/sys/kern/sched_ule.c
> > @@ -2995,6 +2995,11 @@ sched_throw(struct thread *td)
> > 
> >         tdq = TDQ_SELF();
> >         if (__predict_false(td == NULL)) {
> > +               if (tdq == NULL || PCPU_GET(idlethread)->td_lock !=
> > +                   TDQ_LOCKPTR(tdq)) {
> > +                       schedinit_ap();
> > +                       tdq = TDQ_SELF();
> > +               }
> >                 TDQ_LOCK(tdq);
> >                 /* Correct spinlock nesting. */
> >                 spinlock_exit();
> > 
>