Questions with a powerpc64/powerpc context: relaxed use of smp_cpus in umtxq_busy vs. relaxed updates to smp_cpus in machine dependent code?

Mark Millard marklmi at yahoo.com
Wed Feb 13 20:33:58 UTC 2019


Why I ask the questions below (after providing context):
Old multi-processor PowerMac G5s have boot issues: they
frequently (but not always) hang during cpu_mp_unleash.


/usr/src/sys/kern/kern_umtx.c has the following code
(note the smp_cpus use in the machine-independent code):


static inline void
umtxq_busy(struct umtx_key *key)
{
        struct umtxq_chain *uc;
 
        uc = umtxq_getchain(key);
        mtx_assert(&uc->uc_lock, MA_OWNED);
        if (uc->uc_busy) {
#ifdef SMP
                if (smp_cpus > 1) {
                        int count = BUSY_SPINS;
                        if (count > 0) {
                                umtxq_unlock(key);
                                while (uc->uc_busy && --count > 0)
                                        cpu_spinwait();
                                umtxq_lock(key);
                        }
                }
#endif
                while (uc->uc_busy) {
                        uc->uc_waiters++;
                        msleep(uc, &uc->uc_lock, 0, "umtxqb", 0);
                        uc->uc_waiters--;
                }
        }
        uc->uc_busy = 1;
}

On powerpc, the use of smp_cpus here is what C++ would call
a std::memory_order_relaxed load. smp_cpus does change during
the machine-dependent cpu_mp_unleash code in
/usr/src/sys/powerpc/powerpc/mp_machdep.c :

static void
cpu_mp_unleash(void *dummy)
{
. . .
        smp_cpus = 0;
. . .
        STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) {
. . .
                if (pc->pc_awake) {
                        if (bootverbose)
                                printf("Adding CPU %d, hwref=%jx, awake=%x\n",
                                    pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
                                    pc->pc_awake);
                        smp_cpus++;
                } else
. . . 
        }

The assignment and the increments above are relaxed stores.

[This does not appear to be a std::memory_order_consume-like
context (no dependency-ordered-before usage).]
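To make the C++ terminology concrete, here is a minimal user-space C11 sketch. It is not kernel code: demo_smp_cpus and all demo_* functions are hypothetical names, and the kernel's smp_cpus is a plain int rather than a C11 atomic, so this only models the accesses. The first pair of functions has the relaxed shape described above; the second pair shows what a store-release/load-acquire pairing would look like, as an assumption about one possible alternative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for smp_cpus (a plain int in the kernel). */
static atomic_int demo_smp_cpus = 1;

/* Writer shaped like the quoted loop: reset to 0, then count up,
 * with every access relaxed (no ordering against other memory). */
static void demo_unleash_relaxed(const int *awake, size_t n)
{
    assert(awake != NULL || n == 0);
    atomic_store_explicit(&demo_smp_cpus, 0, memory_order_relaxed);
    for (size_t i = 0; i < n; i++) {
        if (awake[i])
            atomic_fetch_add_explicit(&demo_smp_cpus, 1,
                memory_order_relaxed);
    }
}

/* Reader shaped like the "smp_cpus > 1" test: a relaxed load. */
static int demo_is_smp_relaxed(void)
{
    return atomic_load_explicit(&demo_smp_cpus,
        memory_order_relaxed) > 1;
}

/* Assumed alternative: publish a final count with a store-release. */
static void demo_set_cpus_release(int count)
{
    atomic_store_explicit(&demo_smp_cpus, count, memory_order_release);
}

/* Assumed alternative: pair it with a load-acquire in the reader. */
static int demo_is_smp_acquire(void)
{
    return atomic_load_explicit(&demo_smp_cpus,
        memory_order_acquire) > 1;
}
```

Run on a single thread, both readers return the same answers; the difference only matters for what other memory accesses may be reordered around them on another CPU.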

/usr/src/sys/kern/subr_smp.c does initialize smp_cpus to 1
in its definition. (But it temporarily reverts to zero in
the above code.)
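As a rough illustration of that temporary reversion, the following user-space sketch records every value a counter passes through when it is reset and counted up in place, as in the quoted loop. The struct demo_trace type and the demo_* functions are hypothetical. The second function is an assumed alternative (not what FreeBSD does) that counts in a local and stores once, so no intermediate value is ever written.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TRACE_MAX 16

/* Hypothetical log of every value written to the shared counter. */
struct demo_trace {
    int values[DEMO_TRACE_MAX];
    size_t len;
};

static void demo_record(struct demo_trace *t, int v)
{
    assert(t != NULL);
    if (t->len < DEMO_TRACE_MAX)
        t->values[t->len++] = v;
}

/* Same shape as the quoted loop: start at 1, reset to 0, count up. */
static int demo_count_in_place(const int *awake, size_t n,
    struct demo_trace *t)
{
    int shared = 1;              /* initial value, as in subr_smp.c */

    demo_record(t, shared);
    shared = 0;
    demo_record(t, shared);
    for (size_t i = 0; i < n; i++) {
        if (awake[i]) {
            shared++;
            demo_record(t, shared);
        }
    }
    return shared;
}

/* Assumed alternative: count locally, write the shared value once. */
static int demo_count_then_publish(const int *awake, size_t n,
    struct demo_trace *t)
{
    int shared = 1;
    int local = 0;

    demo_record(t, shared);
    for (size_t i = 0; i < n; i++) {
        if (awake[i])
            local++;
    }
    shared = local;
    demo_record(t, shared);
    return shared;
}
```

With two awake CPUs, the in-place variant writes 1, 0, 1, 2 while the alternative writes only 1, 2. A reader testing "> 1" against one of the intermediate values would skip the spin path.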

So far I've not managed to find specific code (in an objdump
of the kernel, say) that uses some form of the following to
control access ordering in the various places where
umtxq_busy is used:

lwsync (acquire/release/AcqRel fence or store-release [with load-acquire code as well])
or:
sync (a.k.a. hwsync and sync 0) (sequentially consistent fence/store/load)
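For reference, the kind of source-level construct that would show up as those instructions can be sketched with C11 fences in user space. The demo_* names are hypothetical and this is not kernel code. The comments about lwsync and sync describe what compilers typically emit on PowerPC for these fences; the exact output depends on the compiler and target, so treat them as an assumption to check against a disassembly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical data published by one CPU and read by another. */
static int demo_payload;
static atomic_int demo_ready;

/* Writer: plain store, release fence, then relaxed flag store.
 * The release fence is typically an lwsync on PowerPC. */
static void demo_publish(int value)
{
    demo_payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&demo_ready, 1, memory_order_relaxed);
}

/* Reader: relaxed flag load, acquire fence, then plain load.
 * The acquire fence is also typically an lwsync on PowerPC.
 * Returns 1 and fills *out if the flag was seen, else 0. */
static int demo_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&demo_ready, memory_order_relaxed))
        return 0;
    atomic_thread_fence(memory_order_acquire);
    *out = demo_payload;
    return 1;
}

/* Sequentially consistent fence: typically sync (hwsync). */
static void demo_full_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}
```

If nothing like these fences (or an equivalent lock or barrier) sits between the smp_cpus updates and the umtxq_busy read, no lwsync or sync would be expected at those points in the objdump.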

Note: smp_cpus is not even volatile, so a register could potentially be
all that holds the sequence of smp_cpus values for a time, with memory
updated only later.
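A small user-space sketch of the difference: accessing the counter through a volatile-qualified pointer obliges the compiler to perform each load and store as written, rather than keeping the running count in a register. The demo_* names are hypothetical. Note that volatile only constrains the compiler; it adds no lwsync or sync and so says nothing about when another CPU sees the stores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain counter, like the non-volatile smp_cpus. */
static int demo_counter = 1;

/* Force a real store to memory for this one access. */
static inline void demo_store_once(int *p, int v)
{
    assert(p != NULL);
    *(volatile int *)p = v;
}

/* Force a real load from memory for this one access. */
static inline int demo_load_once(const int *p)
{
    assert(p != NULL);
    return *(const volatile int *)p;
}

/* Reset and count up, with every step written through to memory. */
static int demo_count_awake(const int *awake, size_t n)
{
    demo_store_once(&demo_counter, 0);
    for (size_t i = 0; i < n; i++) {
        if (awake[i])
            demo_store_once(&demo_counter,
                demo_load_once(&demo_counter) + 1);
    }
    return demo_load_once(&demo_counter);
}
```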

Nor have I yet found the earliest use of the umtxq_busy code. If it is
late enough after cpu_mp_unleash, something other than a local code
structure might implicitly provide the needed ordering.

Can anyone point me to example(s) of what guarantees that umtxq_busy
accesses the intended smp_cpus value?

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


