svn commit: r282971 - in head/sys: kern sys

Wed May 20 17:41:09 UTC 2015

On Wed, 20 May 2015 09:54:45 -0700
Matthew Ahrens <matt at mahrens.org> wrote:

> On Wed, May 20, 2015 at 9:00 AM, Alexander Kabaev <kabaev at gmail.com>
> wrote:
> 
> > On Fri, 15 May 2015 13:50:38 +0000 (UTC)
> > John Baldwin <jhb at FreeBSD.org> wrote:
> >
> > > Author: jhb
> > > Date: Fri May 15 13:50:37 2015
> > > New Revision: 282971
> > > URL: https://svnweb.freebsd.org/changeset/base/282971
> > >
> > > Log:
> > >   Previously, cv_waiters was only updated by cv_signal or
> > > cv_wait. If a thread awakened due to a time out, then cv_waiters
> > > was not decremented. If INT_MAX threads timed out on a cv without
> > > an intervening cv_broadcast, then cv_waiters could overflow. To
> > > fix this, have each sleeping thread decrement cv_waiters when it
> > > resumes.
> > >
> > >   Note that previously cv_waiters was protected by the sleepq
> > > chain lock. However, that lock is not held when threads resume
> > > from sleep. In addition, the interlock is also not always
> > > reacquired after resuming (cv_wait_unlock), nor is it always held
> > > by callers of cv_signal() or cv_broadcast(). Instead, use atomic
> > > ops to update cv_waiters. Since the sleepq chain lock is still
> > > held on every increment, it should still be safe to compare
> > > cv_waiters against zero while holding the lock in the wakeup
> > > routines as the only way the race should be lost would result in
> > > extra calls to sleepq_signal() or sleepq_broadcast().
> > >   Differential Revision:      https://reviews.freebsd.org/D2427
> > >   Reviewed by:        benno
> > >   Reported by:        benno (wrap of cv_waiters in the field)
> > >   MFC after:  2 weeks
> > >
> > > Modified:
> > >   head/sys/kern/kern_condvar.c
> > >   head/sys/sys/condvar.h
> > >
> >
> > This breaks ZFS range locking code, which expects to be able to
> > wakeup everyone on the condition variable and then free the
> > structure that contains it. Having woken up threads modify
> > cv_waiters results in a race that leads to already freed memory to
> > be accessed.
> >
> > It is debatable just how correct ZFS code in its expectations, but I
> > think this commit should probably be reverted until either ZFS is
> > changed not to expect cv modifiable by waking threads or until
> > alternative solution is found to the cv_waiters overflow issue
> > fixed by this commit.
> >
> >
> It isn't clear to me how the zfs_range_unlock() code could know when
> all the waiters have woken up and updated the CV, and thus it's safe
> to destroy/free the CV.  Would the woken threads ask, "was I the last
> thread to be woken by this CV" and if so free the struct containing
> the CV? Obviously such a check would need to ensure that the other
> threads have completed their updates to the CV.
> 
> --matt

Assuming other threads _need_ to update cv after they have been woken
up. Clearly Solaris implementation managed to do without and our code
changed that breaking range locks implementation we took directly from
OpenSolaris (or illumos). What was previously possible now isn't. As I
wrote before, while merits of this expectations are debatable and it is
not hard to see where Solaris way is advantageous, that is really
besides the point. Are you arguing that we should leave kernel in known
broken state until 'proper' fix makes its way through possible upstream
detour?

Also, we have large code base taken from Solaris and chances are this is
not the only place that might be affected. I think we are better off
with this commit temporarily reverted until necessary repairs and
auditing are complete for it to be safely re-enabled.

-- 
Alexander Kabaev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freebsd.org/pipermail/svn-src-all/attachments/20150520/f1d84cdc/attachment.sig>