About the memory barrier in BSD libc

Wed Apr 25 07:05:55 UTC 2012

On Wed, Apr 25, 2012 at 2:26 PM, Konstantin Belousov
<kostikbel at gmail.com> wrote:
> On Wed, Apr 25, 2012 at 11:25:40AM +0800, Fengwei yin wrote:
>> On Wed, Apr 25, 2012 at 2:13 AM, Konstantin Belousov
>> <kostikbel at gmail.com> wrote:
>> > On Tue, Apr 24, 2012 at 07:00:33PM +0100, Martin Simmons wrote:
>> >> >>>>> On Tue, 24 Apr 2012 19:58:42 +0300, Konstantin Belousov said:
>> >> >
>> >> > +
>> >> > +   /*
>> >> > +    * Lock the spinlock used to protect __sglue list walk in
>> >> > +    * __sfp().  The __sfp() uses fp->_flags == 0 test as an
>> >> > +    * indication of the unused FILE.
>> >> > +    *
>> >> > +    * Taking the lock prevents possible compiler or processor
>> >> > +    * reordering of the writes performed before the final _flags
>> >> > +    * cleanup, making sure that we are done with the FILE before
>> >> > +    * it is considered available.
>> >> > +    */
>> >> > +   STDIO_THREAD_LOCK();
>> >> >     fp->_flags = 0;         /* Release this FILE for reuse. */
>> >> > +   STDIO_THREAD_UNLOCK();
>> >> >     FUNLOCKFILE(fp);
>> >> >     return (r);
>> >>
>> >> Is that assumption about reordering correct?  I.e. is FreeBSD's spinlock a
>> >> two-way barrier?
>> >>
>> >> It isn't unusual for the locking primitive to be a one-way barrier, i.e. (from
>> >> Linux kernel memory-barriers.txt) "Memory operations that occur before a LOCK
>> >> operation may appear to happen after it completes."  See also acq and rel in
>> >> atomic(9).
>> > Taking the lock prevents the __sfp from iterating the list until the
>> > spinlock is released. Since release makes sure that all previous memory
>> > operations become visible, the acquire in the spinlock lock provides
>> > the neccesary guarentee.
>>
>> IMHO, the lock to me is too heavy here. What about this patch?
>>
>> NOTE: patch just show thoughts. I didn't even check build checking.
> Yes, it might be correct. But FreeBSD does prefer the acq/rel barriers
> over the rmb/wmb.
>

There is no stand alone acq/rel APIs (Sorry, as new guy to FreeBSD,
don't know too much APIs).  They are bound to atomic operations.
And yes, atomic_acq/rel should work also.

> Also, the lock is not that heavy right there, and the committed patch
> provides mostly zero overhead for non-threaded case.

But lock could introduced contention in SMP case which could be avoid
with rmb/wmb.

I wonder whether the rmb/wmb could be defined as a compiler barrier on
non-threaded case. Like STDIO_THREAD_LOCK which has version
for thread/non-thread case.

>>
>> diff --git a/lib/libc/stdio/fclose.c b/lib/libc/stdio/fclose.c
>> index f0629e8..a26f944 100644
>> --- a/lib/libc/stdio/fclose.c
>> +++ b/lib/libc/stdio/fclose.c
>> @@ -38,6 +38,7 @@ __FBSDID("$FreeBSD$");
>>
>>  #include "namespace.h"
>>  #include <errno.h>
>> +#include <machine/atomic.h>
>>  #include <stdio.h>
>>  #include <stdlib.h>
>>  #include "un-namespace.h"
>> @@ -65,6 +66,7 @@ fclose(FILE *fp)
>>               FREELB(fp);
>>       fp->_file = -1;
>>       fp->_r = fp->_w = 0;    /* Mess up if reaccessed. */
>> +     wmb();
>>       fp->_flags = 0;         /* Release this FILE for reuse. */
>>       FUNLOCKFILE(fp);
>>       return (r);
>> diff --git a/lib/libc/stdio/findfp.c b/lib/libc/stdio/findfp.c
>> index 89c0536..03b2945 100644
>> --- a/lib/libc/stdio/findfp.c
>> +++ b/lib/libc/stdio/findfp.c
>> @@ -129,9 +129,16 @@ __sfp()
>>        */
>>       THREAD_LOCK();
>>       for (g = &__sglue; g != NULL; g = g->next) {
>> -             for (fp = g->iobs, n = g->niobs; --n >= 0; fp++)
>> -                     if (fp->_flags == 0)
>> +             for (fp = g->iobs, n = g->niobs; --n >= 0; fp++) {
>> +                     int __flags = fp->_flags;
>> +                     rmb();
>> +                     /*
>> +                      * If could see __flags is zero here, we are sure
>> +                      * the cleanup in fclose is done.
>> +                      */
>> +                     if (__flags == 0)
>>                               goto found;
>> +             }
>>       }
>>       THREAD_UNLOCK();        /* don't hold lock while malloc()ing. */
>>       if ((g = moreglue(NDYNAMIC)) == NULL)