ext2fs crash in -current (r218056)

Thu Feb 3 15:13:22 UTC 2011

On Thu, Feb 03, 2011 at 04:01:42PM +0200, Kostik Belousov wrote:
> On Thu, Feb 03, 2011 at 07:53:55AM -0500, John Baldwin wrote:
> > On Wednesday, February 02, 2011 5:20:23 pm Jeremy Chadwick wrote:
> > > On Wed, Feb 02, 2011 at 05:04:03PM -0500, John Baldwin wrote:
> > > > On Wednesday, February 02, 2011 04:13:48 pm Doug Barton wrote:
> > > > > I haven't had a chance to test this patch yet, but John's did not work
> > > > > (sorry):
> > > > > 
> > > > > http://dougbarton.us/ext2fs-crash-dump-2.jpg
> > > > > 
> > > > > No actual dump this time either.
> > > > > 
> > > > > I'm happy to test the patch below on Thursday if there is consensus that
> > > > > it will work.
> > > > 
> > > > Err, this is a different panic than what you reported earlier.  Your disk died 
> > > > and spewed a bunch of EIO errors.  I can look at the locking assertion failure 
> > > > tomorrow, but this is a different issue.  Even UFS needed a good bit of work to 
> > > > handle disks dying gracefully.
> > > 
> > > Are the byte offsets shown in the screenshot within the range of the
> > > drive's capacity?  They're around the 10.7GB mark, but I have no idea
> > > what size disk is being used.
> > > 
> > > The reason I ask is that there have been reported issues in the past
> > > where the offsets shown are way outside of the range of the permitted
> > > byte offsets of the disk itself (and in some cases even showing a
> > > negative number; what is it with people not understanding the difference
> > > between signed and unsigned types?  Sigh), and I want to make sure this
> > > isn't one of those situations.  I also don't know if underlying
> > > filesystem corruption could cause the problem in question ("filesystem
> > > says you should write to block N, which is outside of the permitted
> > > range of the device").
> > 
> > Just one comment.  UFS uses negative block numbers to indicate an indirect
> > block (or some such) as opposed to a direct block of data.  It's a purposeful
> > feature that allows one to instantly spot if a problem relates to a direct
> > block vs an indirect block.
> Yes, but the block numbers are negative within the vnode address range,
> not for the on-disk block numbers. ufs_bmap() should translate negative
> vnode block numbers to positive disk block numbers before the buffer is
> passed down.

I'm a bit out of my league here (going entirely off of kernel source
code), but this is educational for me as well as (probably) others.
The error string being discussed is something like:

g_vfs_done():da0s2[WRITE(offset=10727313400, length=131072)]error = 5

The output comes from src/sys/geom/geom_vfs.c, function g_vfs_done():

static void
g_vfs_done(struct bio *bip)
{
...
        if (bip->bio_error) {
                printf("g_vfs_done():");
                g_print_bio(bip);
                printf("error = %d\n", bip->bio_error);
        }
...

g_print_bio() comes from src/sys/geom/geom_io.c, and prints its output
based on what bip->bio_cmd contains.  Since the message says WRITE, the
command here must be BIO_WRITE, which falls through into the BIO_DELETE
case and shares its printf() (basing this on the case statement):

void
g_print_bio(struct bio *bp)
{
        const char *pname, *cmd = NULL;

        if (bp->bio_to != NULL)
                pname = bp->bio_to->name;
        else
                pname = "[unknown]";

        switch (bp->bio_cmd) {
...
        case BIO_WRITE:
                if (cmd == NULL)
                        cmd = "WRITE";
        case BIO_DELETE:
                if (cmd == NULL)
                        cmd = "DELETE";
                printf("%s[%s(offset=%jd, length=%jd)]", pname, cmd,
                    (intmax_t)bp->bio_offset, (intmax_t)bp->bio_length);
...

The offset and the length are both explicitly cast to intmax_t and
printed as signed numbers (%jd) here.

For me anyway, the next question is "what are bio_offset and bio_length
defined as?" (indirectly, "why the explicit cast?").  They're both
declared as part of struct bio in src/sys/sys/bio.h as shown:

struct bio {
...
        off_t   bio_offset;             /* Offset into file. */
...
        off_t   bio_length;             /* Like bio_bcount */
...

Since I'm not familiar with the bio stuff, I can't determine if the
above printf() statement is actually correct or incorrect.  Ultimately,
of course, I'm trying to determine if "offset=XXX, length=XXX" actually
represent what folks think they would.

I'm now thinking the error message indicates something equivalent to "I
got EIO when attempting to work on file offset XXX, when writing or
reading length XXX bytes", and that the "file" gets expanded to a device
name in this case.  Or do I have it wrong and the "file" is actually the
disk (filesystem) itself?  I imagine this is where the vfs stuff comes
into play... somehow.  *over my head*  :-)

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |