kern/93942: panic: ufs_dirbad: bad dir
Yarema
yds at CoolRat.org
Tue Feb 28 07:40:12 PST 2006
>Number: 93942
>Category: kern
>Synopsis: panic: ufs_dirbad: bad dir
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Feb 28 15:40:06 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator: Yarema <yds at CoolRat.org>
>Release: FreeBSD 6.1-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD 6.1-PRERELEASE #0: Mon Feb 27 04:52:11 EST 2006 i386
>Description:
This is at least the third file system corrupted for me by the
ufs_dirbad bug, each on a different hard drive, since 5.3-STABLE.
I suspect this is related to the following PRs:
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=49079
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=51001
In every case a process would lock up, making the whole system
unresponsive. After a reboot, fsck -y in single-user mode, and another
reboot, mounting the corrupt fs read-write would produce the
following:
bad dir ino 2 at offset 16384: mangled entry
panic: ufs_dirbad: bad dir
cpuid = 0
Repeating the cycle (reboot, fsck -y in single-user mode, reboot)
produces the same panic every time. Previously I recovered by mounting
the corrupt fs read-only, backing it up, running newfs, and restoring.
Recently I noticed that Matthew Dillon committed the following to the
DragonFly src repository:
http://leaf.DragonFlyBSD.org/mailarchive/commits/2006-02/msg00057.html
dillon 2006/02/21 10:46:56 PST
DragonFly src repository
Modified files:
sys/kern vfs_cluster.c
Log:
bioops.io_start() was being called in a situation where the buffer could
be brelse()'d afterwards instead of I/O being initiated. When this occurs,
the buffer may contain softupdates-modified data which is never reverted,
resulting in serious filesystem corruption. When io_start is called on a
buffer, I/O MUST be initiated and terminated with a biodone() or the buffer's
data may not be properly reverted.
Solve the problem by moving the io_start() call a little further on in the
code, after the potential brelse().
There is a possibility that this bug is responsible for the 'dirbad' panics
often reported in DragonFly and FreeBSD circles.
Revision Changes Path
1.16 +7 -6 src/sys/kern/vfs_cluster.c
http://www.DragonFlyBSD.org/cvsweb/src/sys/kern/vfs_cluster.c.diff?r1=1.15&r2=1.16&f=u
Below is the equivalent patch to the FreeBSD RELENG_6 branch of
src/sys/kern/vfs_cluster.c
Hope this helps track down the problem.
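To make the ordering problem described in the commit log concrete, here is a
minimal user-space sketch. It is not kernel code: toy_buf, toy_io_start(),
toy_biodone(), toy_brelse(), old_order() and new_order() are hypothetical
names invented for illustration, and the "roll back to 0" behaviour is only
an assumed stand-in for whatever softupdates does to the buffer contents at
io_start time. The point it tries to show is that calling the start hook
before a possible release leaves the buffer modified with nothing to undo
the modification.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a buffer; this is NOT the kernel's struct buf. */
struct toy_buf {
	int  data;        /* current in-memory contents */
	int  saved;       /* contents set aside by toy_io_start() */
	bool has_dep;     /* models a non-empty b_dep list */
	bool rolled_back; /* true between toy_io_start() and toy_biodone() */
};

/* Models io_start: temporarily replace pending data before the write. */
static void toy_io_start(struct toy_buf *bp)
{
	assert(!bp->rolled_back);	/* must not be started twice */
	if (!bp->has_dep)
		return;
	bp->saved = bp->data;
	bp->data = 0;			/* assumed placeholder value */
	bp->rolled_back = true;
}

/* Models biodone: the I/O finished, so the temporary change is undone. */
static void toy_biodone(struct toy_buf *bp)
{
	if (bp->rolled_back) {
		bp->data = bp->saved;
		bp->rolled_back = false;
	}
}

/* Models brelse: the buffer is released without I/O; nothing is undone. */
static void toy_brelse(struct toy_buf *bp)
{
	(void)bp;
}

/* Old ordering: start hook runs before the point where we may bail out. */
static int old_order(bool bail_out)
{
	struct toy_buf b = { .data = 42, .has_dep = true };

	toy_io_start(&b);
	if (bail_out) {
		toy_brelse(&b);		/* modified data is left behind */
		return b.data;
	}
	toy_biodone(&b);
	return b.data;
}

/* New ordering: start hook runs only once I/O is certain to happen. */
static int new_order(bool bail_out)
{
	struct toy_buf b = { .data = 42, .has_dep = true };

	if (bail_out) {
		toy_brelse(&b);		/* buffer was never touched */
		return b.data;
	}
	toy_io_start(&b);
	toy_biodone(&b);
	return b.data;
}
```

In this model old_order(true) returns the placeholder 0 instead of 42, while
new_order() returns 42 whether or not the buffer is released early. Whether
the same sequence is what actually triggers the panic on FreeBSD is exactly
what this PR is asking to have investigated.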
>How-To-Repeat:
mount <corrupt ufs>
>Fix:
--- src/sys/kern/vfs_cluster.c.orig Fri Oct 28 03:28:27 2005
+++ src/sys/kern/vfs_cluster.c Tue Feb 28 09:27:20 2006
@@ -881,11 +881,6 @@
 				bremfree(tbp);
 				tbp->b_flags &= ~B_DONE;
 			} /* end of code for non-first buffers only */
-			/* check for latent dependencies to be handled */
-			if ((LIST_FIRST(&tbp->b_dep)) != NULL) {
-				tbp->b_iocmd = BIO_WRITE;
-				buf_start(tbp);
-			}
 			/*
 			 * If the IO is via the VM then we do some
 			 * special VM hackery (yuck). Since the buffer's
@@ -933,6 +928,11 @@
 			BUF_KERNPROC(tbp);
 			TAILQ_INSERT_TAIL(&bp->b_cluster.cluster_head,
 				tbp, b_cluster.cluster_entry);
+			/* check for latent dependencies to be handled */
+			if ((LIST_FIRST(&tbp->b_dep)) != NULL) {
+				tbp->b_iocmd = BIO_WRITE;
+				buf_start(tbp);
+			}
 		}
 	finishcluster:
 		pmap_qenter(trunc_page((vm_offset_t) bp->b_data),
>Release-Note:
>Audit-Trail:
>Unformatted: