cvs commit: src/sys/kern kern_proc.c

Bosko Milekic bmilekic at FreeBSD.org
Wed Jun 9 16:18:26 GMT 2004


Warner wrote:
>Poul-Henning wrote:
>: Not to pick on anybody, but this is a perfect example of getting locking
>: almost right:
>:
>: BAD:
>:
>:       LOCK(foo->lock)
>:       foo->refcount--;
>:       UNLOCK(foo->lock)
>:       if (foo->refcount == 0)
>:               destroy(foo);
>:
>: GOOD:
>:
>:       LOCK(foo->lock)
>:       i = --foo->refcount;
>:       UNLOCK(foo->lock)
>:       if (i == 0)
>:               destroy(foo);
>:
>
>Can you provide a couple of lines about why BAD is BAD and why GOOD
>fixes that flaw?  That should help others from making this mistake in
>the future.
>
>Warner

  Frankly, I think it is obvious.

  In the BAD case, the decrement itself is done safely, but the
  comparison against zero happens after the lock protecting the count
  has been dropped.  That leaves a race window between releasing the
  lock and checking the refcount: another thread can drop the refcount
  to zero in that window, and both threads then pass the zero check.
  Briefly, this can happen:

  - refcount of 'foo' is 2.
  - thread 1 enters BAD code, lowers refcount to 1, releases lock.
  - thread 2 enters BAD code, lowers refcount to 0, releases lock.
  - thread 1 checks against refcount being zero, decides it is now,
    and proceeds to destroy(foo).
  - thread 2 checks against refcount being zero, decides it is now,
    and proceeds to destroy(foo) as well.

  Conclusion: foo is destroyed twice.

  The GOOD code does not suffer from this problem.  Here is a way to
  handle this sort of race if your reference counter is instead
  manipulated atomically (as opposed to protected by a mutex):
  [From Mbuf-related code]

    MEXT_REM_REF(m);  /* Atomic decrement of m->m_ext.ref_cnt */
    if (atomic_cmpset_int(m->m_ext.ref_cnt, 0, 1)) {
        /*
         * Only the one thread that observes the count at zero wins
         * the compare-and-set; it alone does the free here...
         */
    }
    return;

  -Bosko
 

