kern/93942: panic: ufs_dirbad: bad dir

Tue Feb 28 12:00:27 PST 2006

The following reply was made to PR kern/93942; it has been noted by GNATS.

From: Kris Kennaway <kris at obsecurity.org>
To: Yarema <yds at CoolRat.org>
Cc: FreeBSD-gnats-submit at FreeBSD.org,
	Dennis Koegel <amf at hobbit.neveragain.de>,
	Doug White <dwhite at gumbysoft.com>, Martin Machacek <m at m3a.net>
Subject: Re: kern/93942: panic: ufs_dirbad: bad dir
Date: Tue, 28 Feb 2006 14:53:43 -0500

 On Tue, Feb 28, 2006 at 10:35:36AM -0500, Yarema wrote:
 > 
 > >Number:         93942
 > >Category:       kern
 > >Synopsis:       panic: ufs_dirbad: bad dir
 > >Confidential:   no
 > >Severity:       critical
 > >Priority:       high
 > >Responsible:    freebsd-bugs
 > >State:          open
 > >Quarter:        
 > >Keywords:       
 > >Date-Required:
 > >Class:          sw-bug
 > >Submitter-Id:   current-users
 > >Arrival-Date:   Tue Feb 28 15:40:06 GMT 2006
 > >Closed-Date:
 > >Last-Modified:
 > >Originator:     Yarema <yds at CoolRat.org>
 > >Release:        FreeBSD 6.1-PRERELEASE i386
 > >Organization:
 > >Environment:
 > System: FreeBSD 6.1-PRERELEASE #0: Mon Feb 27 04:52:11 EST 2006 i386
 > 
 > >Description:
 > 
 > This is at least the third file system which got hosed for me by the
 > ufs_dirbad bug on three different hard drives since 5.3 STABLE.
 > I suspect this is related to the following PRs:
 > http://www.FreeBSD.org/cgi/query-pr.cgi?pr=49079
 > http://www.FreeBSD.org/cgi/query-pr.cgi?pr=51001
 > 
 > In every case a process would lock up, making the whole system
 > unresponsive.  A reboot, fsck -y in single user mode and another
 > reboot would produce the following during the mount of the corrupt
 > fs in rw mode:
 > 
 > bad dir ino 2 at  offset 16384: mangled entry
 > panic: ufs_dirbad: bad dir
 > cpuid = 0
 > 
 > Another reboot, fsck -y in single user mode and reboot produces the
 > same results repeatedly.  Previously I had recovered by mounting the
 > corrupt fs in ro mode, then doing a backup, newfs and restore.
 > 
 > Recently I noticed Matthew Dillon commit the following to the
 > DragonFly src repository:
 > 
 > http://leaf.DragonFlyBSD.org/mailarchive/commits/2006-02/msg00057.html
 > 
 > dillon      2006/02/21 10:46:56 PST
 > 
 > DragonFly src repository
 > 
 >   Modified files:
 >     sys/kern             vfs_cluster.c 
 >   Log:
 >   bioops.io_start() was being called in a situation where the buffer could
 >   be brelse()'d afterwards instead of I/O being initiated.  When this occurs,
 >   the buffer may contain softupdates-modified data which is never reverted,
 >   resulting in serious filesystem corruption.  When io_start is called on a
 >   buffer, I/O MUST be initiated and terminated with a biodone() or the buffer's
 >   data may not be properly reverted.
 >   
 >   Solve the problem by moving the io_start() call a little further on in the
 >   code, after the potential brelse().
 >   
 >   There is a possibility that this bug is responsible for the 'dirbad' panics
 >   often reported in DragonFly and FreeBSD circles.
 >   
 >   Revision  Changes    Path
 >   1.16      +7 -6      src/sys/kern/vfs_cluster.c
 > 
 > http://www.DragonFlyBSD.org/cvsweb/src/sys/kern/vfs_cluster.c.diff?r1=1.15&r2=1.16&f=u
 > 
 > Below is the equivalent patch to the FreeBSD RELENG_6 branch of
 > src/sys/kern/vfs_cluster.c
 > 
 > Hope this helps track down the problem.

 Does it work for you? :)

 Kris
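
 The ordering problem described in the quoted commit log can be sketched
 as a small user-space model.  This is only an illustration under stated
 assumptions: it is not the vfs_cluster.c patch, and every name in it
 (toy_buf, toy_io_start, toy_biodone, toy_brelse, toy_write_buggy,
 toy_write_fixed, toy_left_modified) is hypothetical.  It assumes that
 the io_start hook temporarily alters the buffer and that only biodone()
 undoes the alteration, as the log message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel buffer; NOT the real struct buf. */
struct toy_buf {
    bool modified;  /* io_start hook has altered the buffer data */
    bool released;  /* buffer was handed back without any I/O */
};

/* Models bioops.io_start(): alters the data for the duration of a write. */
static void toy_io_start(struct toy_buf *bp)
{
    bp->modified = true;
}

/* Models biodone(): the I/O has finished, so the alteration is undone. */
static void toy_biodone(struct toy_buf *bp)
{
    assert(bp->modified);
    bp->modified = false;
}

/* Models brelse(): the buffer is released and no I/O is ever started. */
static void toy_brelse(struct toy_buf *bp)
{
    bp->released = true;
}

/* Ordering the log describes as buggy: hook first, possible release after. */
static void toy_write_buggy(struct toy_buf *bp, bool skip_io)
{
    toy_io_start(bp);
    if (skip_io) {
        toy_brelse(bp);  /* the alteration is never undone */
        return;
    }
    toy_biodone(bp);
}

/* Ordering after the fix: the hook runs only once I/O is certain. */
static void toy_write_fixed(struct toy_buf *bp, bool skip_io)
{
    if (skip_io) {
        toy_brelse(bp);
        return;
    }
    toy_io_start(bp);
    toy_biodone(bp);
}

/* Returns true if the given ordering leaves the buffer in an altered state. */
static bool toy_left_modified(void (*fn)(struct toy_buf *, bool), bool skip_io)
{
    struct toy_buf b = { false, false };
    fn(&b, skip_io);
    return b.modified;
}
```

 In this model, toy_left_modified(toy_write_buggy, true) reports a buffer
 that stays altered, while the same call with toy_write_fixed does not.
 Whether this is what actually causes the ufs_dirbad panics is still an
 open question in this PR.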