About the memory barrier in BSD libc

Konstantin Belousov kostikbel at gmail.com
Mon Apr 23 13:03:58 UTC 2012


On Mon, Apr 23, 2012 at 08:33:05PM +0800, Fengwei yin wrote:
> On Mon, Apr 23, 2012 at 8:07 PM, Konstantin Belousov
> <kostikbel at gmail.com> wrote:
> > On Mon, Apr 23, 2012 at 07:44:34PM +0800, Fengwei yin wrote:
> >> On Mon, Apr 23, 2012 at 7:38 PM, Slawa Olhovchenkov <slw at zxy.spb.ru> wrote:
> >> > On Mon, Apr 23, 2012 at 07:26:54PM +0800, Fengwei yin wrote:
> >> >
> >> >> On Mon, Apr 23, 2012 at 5:40 PM, Slawa Olhovchenkov <slw at zxy.spb.ru> wrote:
> >> >> > On Mon, Apr 23, 2012 at 05:32:24PM +0800, Fengwei yin wrote:
> >> >> >
> >> >> >> On Mon, Apr 23, 2012 at 4:41 PM, Slawa Olhovchenkov <slw at zxy.spb.ru> wrote:
> >> >> >> > On Mon, Apr 23, 2012 at 02:56:03PM +0800, Fengwei yin wrote:
> >> >> >> >
> >> >> >> >> Hi list,
> >> >> >> >> If this is not the right list for this question, please let me know,
> >> >> >> >> and sorry for the noise.
> >> >> >> >>
> >> >> >> >> I have a question regarding the BSD libc on SMP architectures. I didn't
> >> >> >> >> see any memory barriers used in libc.
> >> >> >> >> How can we make sure it's safe on SMP architectures?
> >> >> >> >
> >> >> >> > /usr/include/machine/atomic.h:
> >> >> >> >
> >> >> >> > #define mb()    __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> >> >> >> > #define wmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> >> >> >> > #define rmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> >> >> >> >
> >> >> >>
> >> >> >> Thanks for the information. But it looks like nobody uses them in libc.
> >> >> >
> >> >> > I think nothing in libc needs a memory barrier: libc doesn't work with
> >> >> > peripherals, and atomic operations use different macros.
> >> >>
> >> >> If we check the usage of __sinit(), it is a typical singleton pattern which
> >> >> needs a memory barrier to avoid potential SMP issues.
> >> >>
> >> >> Or did I miss something here?
> >> >
> >> > Which architecture with cache incoherency does FreeBSD support?
> >>
> >> I suppose it's not related to cache incoherency (I could be wrong). It's
> >> related to the reordering of instructions by the CPU.
> >>
> >> Here is a link explaining why a singleton needs a memory barrier:
> >> http://www.oaklib.org/docs/oak/singleton.html
> >>
> >> x86 has a strict memory model and may not suffer from this kind of issue.
> >> But ARM needs to take care of it, IMHO.
> >
> > Please note that __sinit is idempotent, so double-initialization is not
> > an issue there. The only possibly problematic case would be another thread
> > executing exit and not noticing the non-NULL value of __cleanup while the
> > current thread has just set it.
> >
> > I am not sure how real this race is. Each call to __sinit() is immediately
> > followed by a lock acquire, typically FLOCKFILE(), which enforces full barrier
> > semantics due to the pthread_mutex_lock call. exit() performs the
> > __cxa_finalize() call before checking the __cleanup value, and
> > __cxa_finalize() itself locks atexit_mutex. So the race is tiny and probably
> > possible only for somewhat buggy applications which perform exit() while
> > there are stdio operations in progress.
> >
> > Also note that some functions assign to __cleanup unconditionally.
> >
> > Do you see any real issue due to non-synchronized access to __cleanup?
> 
> No, I didn't see a real issue. I am just reviewing the code.
> 
> If you don't think __sinit has an issue, let's check some other code:
>      line 68 in libc/stdio/fclose.c
>      line 133 in libc/stdio/findfp.c (function __sfp())
> 
> Line 68 frees an fp slot by assigning 0 to fp->_flags. But if the
> instructions could be reordered, another CPU could see fp->_flags set to 0
> before the cleanup in lines 57 to 67.
> 
> Say another CPU is at line 133 of __sfp(): it could see fp->_flags become
> 0 before it is aware that the cleanup (lines 57 to 67 in
> libc/stdio/fclose.c) has happened.
> 
> Note: the mutex in FUNLOCKFILE(fp) at line 69 of libc/stdio/fclose.c only
> ensures that line 70 happens after line 68. It has no effect on the CPU
> reordering lines 57 to 68.

Yes, FUNLOCKFILE() there would have no effect on the potential CPU reordering
of the writes.  But does the order of these writes matter at all?

Please note that __sfp() reinitializes all fields written by fclose().
A problem could arise only if the CPU executing fclose() is allowed to
reorder operations so that the external effect of the _flags = 0 assignment
can be observed before that CPU executes the other operations from fclose().
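
To make the ordering requirement concrete, here is a minimal sketch of the
two sides. It is an illustration only and not the libc code: struct slot,
slot_release() and slot_try_claim() are hypothetical names, and the sketch
assumes C11 <stdatomic.h> with an _Atomic short flag, which is not how
_flags is declared in libc.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a FILE slot; not the real struct __sFILE. */
struct slot {
    void *buf;            /* stands in for the fields fclose() cleans up */
    int fd;
    _Atomic short flags;  /* 0 means "slot is free" (assumed atomic here) */
};

/* Closing side: clean up first, then publish the slot as free. */
static void slot_release(struct slot *s)
{
    s->buf = NULL;
    s->fd = -1;
    /* Release store: the writes above cannot become visible after it. */
    atomic_store_explicit(&s->flags, 0, memory_order_release);
}

/* Allocating side: returns 1 if the slot was free and is now claimed. */
static int slot_try_claim(struct slot *s, short new_flags)
{
    short expected = 0;

    /* Acquire on success pairs with the release store in slot_release(). */
    if (!atomic_compare_exchange_strong_explicit(&s->flags, &expected,
            new_flags, memory_order_acquire, memory_order_relaxed))
        return 0;

    /* Reinitialize the fields the closing side wrote. */
    s->buf = NULL;
    s->fd = -1;
    return 1;
}
```

With the release/acquire pair, a thread that sees flags == 0 is guaranteed
to also see the cleanup writes that preceded it. Without such a pairing,
that guarantee depends on the memory model of the architecture.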

This is definitely impossible on Intel, and I do not know enough about
other architectures to rule out the possibility. The _flags member
is a short, so atomics cannot be used there. The easier solution, if this
is indeed an issue, is to take thread_lock around the _flags = 0 assignment
in fclose().
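
A rough sketch of that locking approach, under stated assumptions:
table_lock stands in for thread_lock, and struct entry, entry_close() and
entry_find_free() are hypothetical stand-ins for the FILE slot, fclose()
and __sfp(). This is not the actual libc code, only the shape of the idea.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for thread_lock; libc uses its own internal lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for a FILE slot. */
struct entry {
    void *buf;
    int fd;
    short flags;   /* 0 means "slot is free" */
};

#define NSLOTS 4
static struct entry table[NSLOTS];

/* Closing side: clean up, then mark the slot free under the lock. */
static void entry_close(struct entry *e)
{
    e->buf = NULL;
    e->fd = -1;
    pthread_mutex_lock(&table_lock);
    e->flags = 0;
    /* The unlock orders all earlier writes before the next lock holder. */
    pthread_mutex_unlock(&table_lock);
}

/* Allocating side: scan for a free slot while holding the same lock. */
static struct entry *entry_find_free(short new_flags)
{
    struct entry *found = NULL;

    pthread_mutex_lock(&table_lock);
    for (size_t i = 0; i < NSLOTS; i++) {
        if (table[i].flags == 0) {
            found = &table[i];
            found->flags = new_flags;
            found->buf = NULL;   /* reinitialize, as __sfp() is said to do */
            found->fd = -1;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return found;
}
```

Because both sides take the same mutex, a scanner that observes flags == 0
also observes every write the closing thread made before its unlock, with
no need for an atomic operation on the short field.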


More information about the freebsd-threads mailing list