[PANIC] ufs_dirbad: bad dir

Sun Dec 4 20:41:12 PST 2005

    I just had a thought on the dirbad panics (which we are also getting
    occasionally).   These are the symptoms I see:

    * Seems to be related to softupdates

    * In all the panics I've seen, I have *NOT* seen any that were
      related to bitmap corruption.  I added a lot of extra checks
      in both the bitmap allocation and free code and not one of them has
      been hit.

    * I have not seen any vm_page corruption.  e.g. no ref count panics.

    * The dirbad panics occur infrequently, implying a narrow window in
      which the problem occurs.

    * The dirbad panics seem to be related to extraction of a tar, or
      filesystem synchronization.  If we assume that extending a file is not
      the problem (unless reallocblks or fragextend is the problem), then
      it could be related to truncation.

    * In all core dumps I've seen from DragonFly systems exhibiting the 
      failure, the directory block in question has universally appeared to
      contain data from a file.  In the last dump I looked at, the first 1K
      of an 8K directory block (well beyond the direct block array in the
      inode) contained data from a file while the remaining 7K had directory
      data in it.
      
    There are two scenarios that I can think of that can have these effects
    yet not trigger any of the bitmap corruption tests/panics.

    (1) If ffs_blkfree() is called on a block whose related buffer still
    exists and is or might become dirty, then a totally UNRELATED vnode 
    can immediately reallocate the physical block and issue its own I/O
    operation.  So we wind up with two buffer cache buffers with different
    vnodes and lblkno's but the same blkno, at least temporarily.  The
    condition would only need to hold long enough for the writes to be
    ordered the wrong way and poof, the new vnode winds up with someone
    else's data in it.
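
    To make the write-ordering hazard in (1) concrete, here is a minimal
    userland sketch.  It is a toy model only: struct toy_buf, toy_disk,
    toy_bwrite() and toy_race() are hypothetical names invented for this
    illustration and are not the kernel's struct buf or any UFS function.
    Two buffers with different owners and lblkno's share one blkno, and
    whichever write lands last decides what the block contains.

```c
#include <assert.h>
#include <string.h>

#define TOY_NBLOCKS 8
#define TOY_BLKSIZE 16

/* Toy stand-in for a buffer cache buffer; NOT the kernel's struct buf. */
struct toy_buf {
    int  owner;                 /* hypothetical vnode id */
    long lblkno;                /* logical block within that vnode */
    long blkno;                 /* physical block on the toy disk */
    char data[TOY_BLKSIZE];
};

static char toy_disk[TOY_NBLOCKS][TOY_BLKSIZE];

/* Copy a buffer's contents to its physical block on the toy disk. */
static void
toy_bwrite(const struct toy_buf *bp)
{
    memcpy(toy_disk[bp->blkno], bp->data, TOY_BLKSIZE);
}

/*
 * Two buffers with different (owner, lblkno) share physical block 5:
 * a stale dirty buffer from the file that freed the block, and the
 * buffer of the directory that reallocated it.
 * Returns 1 if the directory's data is what ends up on disk, else 0.
 */
int
toy_race(int stale_write_last)
{
    struct toy_buf file_buf = { 1, 0, 5, "file data" };
    struct toy_buf dir_buf = { 2, 3, 5, "dir entries" };

    memset(toy_disk, 0, sizeof(toy_disk));
    if (stale_write_last) {
        toy_bwrite(&dir_buf);
        toy_bwrite(&file_buf);  /* stale file data clobbers the directory */
    } else {
        toy_bwrite(&file_buf);
        toy_bwrite(&dir_buf);   /* new owner's data lands last */
    }
    return (memcmp(toy_disk[5], dir_buf.data, TOY_BLKSIZE) == 0);
}
```

    In this model toy_race(0) leaves the directory block intact, while
    toy_race(1) leaves file data in it, which is the kind of contents
    seen in the core dumps.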

    (2) If getblk() is called on (blkdev, blkno) (which softupdates does)
    and that buffer somehow outlives (vp, lblkno) and gets written back
    after the bitmap bits related to (vp, lblkno) have been freed, the
    block might now be owned by a different vnode and we'd be corrupting
    another vnode's data.
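
    One way to hunt for the aliasing in (2) would be a debug scan for
    buffers that map the same physical block under different identities.
    The sketch below is an assumption about how such a check might look,
    not existing kernel code: struct alias_buf, ALIAS_DEV_OWNER and
    alias_count() are made-up names, and a real check would have to walk
    the actual buffer cache under the appropriate locks.

```c
#include <assert.h>
#include <stddef.h>

#define ALIAS_DEV_OWNER 0   /* hypothetical id for the block device itself */

/* Toy description of a cached buffer; NOT the kernel's struct buf. */
struct alias_buf {
    int  owner;     /* hypothetical vnode id, or ALIAS_DEV_OWNER */
    long lblkno;    /* logical block number within the owner */
    long blkno;     /* physical block number on the device */
    int  dirty;     /* nonzero if the buffer may still be written */
};

/*
 * Count pairs of buffers that refer to the same physical block under
 * different (owner, lblkno) identities where at least one is dirty.
 * Any nonzero result means a write-back could hit a block that another
 * buffer also believes it owns.
 */
int
alias_count(const struct alias_buf *bufs, size_t n)
{
    size_t i, j;
    int count = 0;

    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            if (bufs[i].blkno != bufs[j].blkno)
                continue;       /* different physical blocks */
            if (bufs[i].owner == bufs[j].owner &&
                bufs[i].lblkno == bufs[j].lblkno)
                continue;       /* same identity, not an alias */
            if (bufs[i].dirty || bufs[j].dirty)
                count++;
        }
    }
    return (count);
}
```

    For example, a dirty device buffer { ALIAS_DEV_OWNER, 40, 40, 1 }
    alongside a file buffer { 7, 2, 40, 0 } counts as one alias, since
    both resolve to physical block 40.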

    The question is... can such a scenario occur anywhere in the UFS code?
    In particular with softupdates enabled? 

    I did see one possible issue, and that is the fact that 
    softdep_setup_freeblocks() is called BEFORE vinvalbuf() in ffs_truncate().
    If softdeps is somehow able to execute *ANY* of those workitems before
    the vinvalbuf runs, in particular if there is already I/O in progress
    on any of those buffers (so that the buffer doesn't get invalidated by
    vinvalbuf until after the I/O is complete), then that could result in
    the above scenario.  The case occurs in several situations but the main
    one that DragonFly and FreeBSD share is when a file is truncated to
    0-length.
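
    The ordering concern can be modeled in a few lines.  This is a sketch
    under simplifying assumptions and does not mirror the real control
    flow of ffs_truncate(): trunc_model() and its helpers are hypothetical
    names, and one flag stands in for "a freeblocks workitem ran".  It
    records whether there was ever a moment when the block was free in
    the bitmap while a buffer for it was still valid.

```c
#include <assert.h>

enum trunc_order { TRUNC_FREE_FIRST, TRUNC_INVAL_FIRST };

struct trunc_state {
    int block_free;   /* 1 once the bitmap marks the block free */
    int buf_valid;    /* 1 while a buffer still maps the block */
    int hazard;       /* 1 if the block was free while buf_valid */
};

/* Stand-in for a freeblocks workitem actually freeing the block. */
static void
trunc_run_free(struct trunc_state *st)
{
    st->block_free = 1;
    if (st->buf_valid)
        st->hazard = 1;     /* block reusable, old buffer still live */
}

/* Stand-in for the buffer invalidation step. */
static void
trunc_invalidate(struct trunc_state *st)
{
    st->buf_valid = 0;
}

/* Returns 1 if the modeled ordering exposed the hazard window. */
int
trunc_model(enum trunc_order order, int workitem_runs_early)
{
    struct trunc_state st = { 0, 1, 0 };
    int freed = 0;

    if (order == TRUNC_FREE_FIRST) {
        /* The free is queued first and may run before invalidation. */
        if (workitem_runs_early) {
            trunc_run_free(&st);
            freed = 1;
        }
        trunc_invalidate(&st);
        if (!freed)
            trunc_run_free(&st);
    } else {
        trunc_invalidate(&st);
        trunc_run_free(&st);
    }
    return (st.hazard);
}
```

    In this model only trunc_model(TRUNC_FREE_FIRST, 1) reports the
    hazard.  Freeing first is harmless as long as no workitem runs
    early, which would be consistent with the panics being rare.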

						-Matt