svn commit: r348810 - head/sys/x86/x86

Enji Cooper yaneurabeya at gmail.com
Sat Jun 8 19:58:27 UTC 2019


Hi!

> On Jun 8, 2019, at 11:26, Jonathan T. Looney <jtl at freebsd.org> wrote:
> 
> Author: jtl
> Date: Sat Jun  8 18:26:48 2019
> New Revision: 348810
> URL: https://svnweb.freebsd.org/changeset/base/348810
> 
> Log:
>  Currently, MCA entries remain on an ever-growing linked list. This means
>  that it becomes increasingly expensive to process a steady stream of
>  correctable errors. Additionally, the memory used by the MCA entries can
>  grow without bound.
> 
>  Change the code to maintain two separate lists: a list of entries which
>  still need to be logged, and a list of entries which have already been
>  logged. Additionally, allow a user-configurable limit on the number of
>  entries which will be saved after they are logged. (The limit defaults
>  to -1 [unlimited], which is the current behavior.)
> 
>  Reviewed by:    imp, jhb
>  MFC after:    2 weeks
>  Sponsored by:    Netflix
>  Differential Revision:    https://reviews.freebsd.org/D20482

Briefly looking through the code, I wondered whether the type and locking for mca_freecount (both before and after this commit) are correct, given that it appears to be modified from multiple call sites and from different tasks.

Thank you!
-Enji

> Modified:
>  head/sys/x86/x86/mca.c
> 
> Modified: head/sys/x86/x86/mca.c
> ==============================================================================
> --- head/sys/x86/x86/mca.c    Sat Jun  8 17:49:17 2019    (r348809)
> +++ head/sys/x86/x86/mca.c    Sat Jun  8 18:26:48 2019    (r348810)
> @@ -86,7 +86,6 @@ struct amd_et_state {
> 
> struct mca_internal {
>    struct mca_record rec;
> -    int        logged;
>    STAILQ_ENTRY(mca_internal) link;
> };
> 
> @@ -101,6 +100,7 @@ static MALLOC_DEFINE(M_MCA, "MCA", "Machine Check Arch
> 
> static volatile int mca_count;    /* Number of records stored. */
> static int mca_banks;        /* Number of per-CPU register banks. */
> +static int mca_maxcount = -1;    /* Limit on records stored. (-1 = unlimited) */
> 
> static SYSCTL_NODE(_hw, OID_AUTO, mca, CTLFLAG_RD, NULL,
>     "Machine Check Architecture");
> @@ -125,10 +125,11 @@ SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RDTU
> static STAILQ_HEAD(, mca_internal) mca_freelist;
> static int mca_freecount;
> static STAILQ_HEAD(, mca_internal) mca_records;
> +static STAILQ_HEAD(, mca_internal) mca_pending;
> static struct callout mca_timer;
> static int mca_ticks = 3600;    /* Check hourly by default. */
> static struct taskqueue *mca_tq;
> -static struct task mca_refill_task, mca_scan_task;
> +static struct task mca_resize_task, mca_scan_task;
> static struct mtx mca_lock;
> 
> static unsigned int
> @@ -557,32 +558,49 @@ mca_check_status(int bank, struct mca_record *rec)
> }
> 
> static void
> -mca_fill_freelist(void)
> +mca_resize_freelist(void)
> {
> -    struct mca_internal *rec;
> -    int desired;
> +    struct mca_internal *next, *rec;
> +    STAILQ_HEAD(, mca_internal) tmplist;
> +    int count, i, desired_max, desired_min;
> 
>    /*
>     * Ensure we have at least one record for each bank and one
> -     * record per CPU.
> +     * record per CPU, but no more than twice that amount.
>     */
> -    desired = imax(mp_ncpus, mca_banks);
> +    desired_min = imax(mp_ncpus, mca_banks);
> +    desired_max = imax(mp_ncpus, mca_banks) * 2;
> +    STAILQ_INIT(&tmplist);
>    mtx_lock_spin(&mca_lock);
> -    while (mca_freecount < desired) {
> +    while (mca_freecount > desired_max) {
> +        rec = STAILQ_FIRST(&mca_freelist);
> +        KASSERT(rec != NULL, ("mca_freecount is %d, but list is empty",
> +            mca_freecount));
> +        STAILQ_REMOVE_HEAD(&mca_freelist, link);
> +        mca_freecount--;
> +        STAILQ_INSERT_TAIL(&tmplist, rec, link);
> +    }
> +    while (mca_freecount < desired_min) {
> +        count = desired_min - mca_freecount;
>        mtx_unlock_spin(&mca_lock);

Should this unlock also be outside the loop, as it was before, so that the locking stays balanced, the lock is always released, and the current thread yields to others while allocating? Or should the lock be held across both loops?

> -        rec = malloc(sizeof(*rec), M_MCA, M_WAITOK);
> +        for (i = 0; i < count; i++) {
> +            rec = malloc(sizeof(*rec), M_MCA, M_WAITOK);
> +            STAILQ_INSERT_TAIL(&tmplist, rec, link);
> +        }
>        mtx_lock_spin(&mca_lock);
> -        STAILQ_INSERT_TAIL(&mca_freelist, rec, link);
> -        mca_freecount++;
> +        STAILQ_CONCAT(&mca_freelist, &tmplist);
> +        mca_freecount += count;
>    }
>    mtx_unlock_spin(&mca_lock);
> +    STAILQ_FOREACH_SAFE(rec, &tmplist, link, next)
> +        free(rec, M_MCA);
> }

Thanks!
-Enji
