Question about panic in brelse()
Christoph Mallon
christoph.mallon at gmx.de
Tue Jan 20 02:43:27 PST 2009
Sadly I got no response about the panic problem so far. I investigated
further and I came to the conclusion, that there are at least two
problems/bugs in brelse(). Here are my findings. All lines refer to
sys/kern/vfs_bio.c at r183754, which is the most recent version of this
file.
Below is the KASSERT(), which triggers in bundirty() (which is called by
brelse()):
KASSERT(bp->b_flags & B_REMFREE || bp->b_qindex == QUEUE_NONE,
("bundirty: buffer %p still on queue %d", bp, bp->b_qindex));
So this assertion triggers, if neither B_REMFREE is set nor the buffer
is in QUEUE_NONE.
Now lets have a look at brelse() before the call to bundirty() happens.
In line 1325 we find:
if (bp->b_flags & B_REMFREE)
bremfreel(bp);
In bremfreel() B_REMFREE is cleared (line 684). So after this B_REMFREE
is guaranteed to be clear and so the first part of the triggered
KASSERT() will be false.
Now for the second part, i.e. QUEUE_NONE: At line 1331 starts an
if-cascade. No matter which case of this cascade is entered, b_qindex is
set and it especially in no case it is set to QUEUE_NONE. so after this
if-cascade it is guaranteed that b_qindex is *not* QUEUE_NONE.
So when we arrive at line 1373ff at the problematic call to bundirty(),
which contains the KASSERT(), it is guaranteed that neither B_REMFREE is
set nor the buffer is in QUEUE_NONE and therefore the KASSERT() will be
triggered. So we have a serious problem here: When the call is done, it
*will* fail.
This is one part of the problem. I have no solution, I don't know what
would be correct here. Further, I looked at r93707 when this call to
bundirty() was added. The situation was basically the same back then,
i.e. the buffer could not be in QUEUE_NONE (the check for B_REMFREE did
not exist back then, so the KASSERT() was more strict). So this change
seems to wrong all along or at least dubious. The commit log of r93707
on the other hand claims that it solved a serious problem. If this was
tested back then, it must have been done without INVARIANTS. It seems to
be, that it was a hot fix - mind the MFC time of 1 day.
Now for the second problem. As stated in the original mail, there is a
buffer for which a write attempt failed:
> b_iocmd = BIO_WRITE
> b_ioflags = BIO_ERROR
> b_error = EIO
> b_flags = B_NOCACHE
Because the error ist "just" an EIO, the buffer is re-dirtied and
BIO_ERROR is cleared (line 1144ff). So writing the buffer should be
tried again.
But later in line 1342ff it is determined that the buffer contains junk,
because B_NOCACHE is set so B_INVAL is also set. This leads to entering
the path in line 1373, which then triggers the KASSERT().
On the other hand, the buffer does not contain junk, because it should
be retried to be written! It only should be not kept around after the
write was successful (or is considered to be failed permamently). So I
think testing B_NOCACHE here alone is to weak and the condition has to
be modified. I propose this change:
@@ -1340,7 +1340,8 @@
}
TAILQ_INSERT_HEAD(&bufqueues[bp->b_qindex], bp, b_freelist);
/* buffers with junk contents */
- } else if (bp->b_flags & (B_INVAL | B_NOCACHE | B_RELBUF) ||
+ } else if (bp->b_flags & (B_INVAL | B_RELBUF) ||
+ ((bp->b_flags & (B_NOCACHE | B_DELWRI)) == B_NOCACHE)
(bp->b_ioflags & BIO_ERROR)) {
bp->b_flags |= B_INVAL;
bp->b_xflags &= ~(BX_BKGRDWRITE | BX_ALTDATA);
This only enters the path to invalidate the buffer in case of B_NOCACHE
being set, if B_DELWRI is *not* set.
Can somebody reconstruct my analysis and confirm/reject it?
Does my proposed solution for the second problem look sensible?
Are the two problems related, i.e. can the first problem only occur
because the second problem exists?
What about the change in r93707 in general? Is it wrong? Does my
proposal remove the cause for the change done in r93707 or are there
other circumstances by which the situation can be triggered?
Hopefully somebody can shed some more light on this.
Christoph
More information about the freebsd-hackers
mailing list