svn commit: r305129 - head/sys/vm

Wed Aug 31 16:39:50 UTC 2016

On Thu, Sep 01, 2016 at 01:38:31AM +1000, Bruce Evans wrote:
> On Wed, 31 Aug 2016, Konstantin Belousov wrote:
> 
> > Log:
> >  Make swapoff reliable.
> >
> >  The swap_pager_swapoff() function uses trylock for the object lock
> >  before pagein, which means that either i/o to md(4) over swap, or
> >  intensive page faults over swap pager objects might prevent swapoff()
> >  from making any progress. Then the retry < 100 check fails and machine
> >  panics.
> >
> >  If trylock fails, acquire the object lock in the blockable way and
> >  restart the hash bucket walk.  Keep retries logic for now.
> 
> Trylock is difficult to use.  When it fails, there is no way for the
> caller to tell how long it should wait before retrying (much like EAGAIN
> errors for userland).  Sometimes there is deadlock so retrying is worse
> than useless.  Sometimes trylock can detect deadlock, but callers never
> can (otherwise they wouldn't try).
It is actually a way for the caller to tell how long to wait. The lock
should be taken in blockable or sleepable way. Issue is that we have to
drop other locks, to avoid LOR, and restart the algorithm.

This approach, i.e. try fast to get lock with trylock, and do the slow
reset on failure with the required lock already held or not, is very
common pattern.  E.g. it is used in vm_fault() to handle LOR between
vnode lock and vm map lock, it is used in ufs_rename() to avoid LORs
between all vnode locks participating in rename, and in many other
places.

> 
> My version of mtx_trylock_spin_flags() has a timeout in usec.  It returns
> immediately if deadlock is detected.  The timeout is just for convenience
> in simple cases where the caller want to aquire the lock normally but
> doesn't want the unbounded timeout or panic given by
> mtx_trylock_spin().  The important thing is to return error codes like:
> - EDEADLK for deadlock detected (e.g., when the CPU running the thread
>    holding the lock is stopped)
> - EMAYBEDEADLK when deadlock is almost detected (e.g., when CPUs are being
>    stopped)
> - EAGAIN when no problem is detected but the lock is held
> - ERECURSE when the lock is held but acquiring it recursively would work
>    if the caller asked for that
> - ELOR if acquiring the lock would work but give a LOR, and the caller didn't
>    ask for LORs.
> 
> The current return value is positive logic for success and doesn't allow
> returning error codes.
Your use of trylock for console locks is very specific.  It is implied by
the intent of console output working in any context, which is not needed
for typical trylock usage.

> 
> In my applications in console drivers, the caller doesn't want the lock if
> it would give a LOR.  Without WITNESS, it is too hard to tell if the lock
> would give a LOR, but some errors are easy to detect.  E.g., acquiring
> sleep lock in a critical section.
Indeed.