kern/122961: write operation on msdosfs file system causes panic
brde at optusnet.com.au
Mon Apr 21 23:49:35 UTC 2008
On Mon, 21 Apr 2008, Dominic Fandrey wrote:
> gavin at FreeBSD.org wrote:
>> To submitter: are you able to connect the USB stick to a machine
>> running Windows and run chkdsk, to confirm that the filesystem
>> is not invalid? (Although we should ideally be resilient to
>> corrupt filesystems, if it still panics after a chkdsk then it's
>> a more serious problem...)
> I have already checked the stick under Windows. chkdsk did not find any
> problems, but the panic still occurs.
> The problem started after I updated RELENG_7 on my machine this weekend. The
> previous RELENG_7 build was ~2 months old.
This seems to be a bug in usb (umass) or the particular usb drive.
msdosfs now uses the drive's advertised max i/o size (mp->mnt_iosize_max)
to implement vfs clustering, but mnt_iosize_max seems to be broken for
some drives. This is only a theory because bug reporters never respond
to requests for more info.
Note that there are lots of bugs in the initialization of mp->mnt_iosize_max.
It is always MAXPHYS (128K), but few drives support this. GEOM bogusly
splits up large i/o's into units that the drive claims to support
(d_maxsize). d_maxsize is bogusly initialized to the fixed value of
DFLTPHYS (64K) in many drivers including da. Bad things then happen if
a scsi drive doesn't actually support d_maxsize = 64K.
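To make the arithmetic concrete, here is a minimal userland sketch of that
splitting. It is not the GEOM code; split_request() and the *_BYTES macros
are hypothetical names made up for illustration, and only the 128K and 64K
values come from the text above.

```c
#include <stddef.h>
#include <stdio.h>

#define MAXPHYS_BYTES	(128 * 1024)	/* MAXPHYS value quoted above */
#define DFLTPHYS_BYTES	(64 * 1024)	/* DFLTPHYS value quoted above */

/*
 * Hypothetical helper, not a kernel function: walk a request of
 * `length` bytes in pieces of at most `max_size` bytes and return
 * the number of pieces.  With `verbose` set, print each piece.
 */
static size_t
split_request(size_t length, size_t max_size, int verbose)
{
	size_t offset = 0, pieces = 0;

	if (max_size == 0)
		return (0);		/* nothing sensible to do */
	while (offset < length) {
		size_t chunk = length - offset;

		if (chunk > max_size)
			chunk = max_size;
		if (verbose)
			printf("piece %zu: offset %zu, length %zu\n",
			    pieces, offset, chunk);
		offset += chunk;
		pieces++;
	}
	return (pieces);
}
```

split_request(MAXPHYS_BYTES, DFLTPHYS_BYTES, 1) gives two 64K pieces. If
the drive can really only do, say, 32K transfers (an assumed figure), each
of those pieces is still twice what it can handle, because the split was
done against the advertised limit and not the real one.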
To check that this is the bug, mount msdosfs with -o noclusterr,noclusterw
under RELENG_7 or later (the bug also affects RELENG_6, but these mount
options are broken in RELENG_6). Then write and read some files, using
write() and not mmap(). (Use dd, or cp a file larger than 8M: cp always
uses mmap() for files smaller than 8M (a good pessimization if the file
is not in the buffer cache), the nocluster* mount options don't affect
mmap() for any file system (another bug), and there is no option to prevent
cp using mmap().) Then remount without nocluster* and repeat. The bug
should only affect the repeat.
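If you would rather not depend on what dd or cp do internally, a small
test program that sticks to write() and read() is enough. The following
is only a sketch: check_write_read(), fill_pattern() and the 64K CHUNK
size are my own choices for illustration, not anything from the tree.
Note that the read-back may be satisfied from the buffer cache, so
unmounting and remounting between the write and a later read is a
stricter test.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK	(64 * 1024)	/* arbitrary i/o size for this sketch */

/* Fill buf with a pattern that depends only on the file offset. */
static void
fill_pattern(unsigned char *buf, size_t offset, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		buf[i] = (unsigned char)(((offset + i) * 131u) >> 3);
}

/*
 * Write `total` bytes to `path` with write(), then read them back
 * with read() and compare.  Returns 0 if everything matched, -1 on
 * any error or mismatch.  mmap() is never used.
 */
static int
check_write_read(const char *path, size_t total)
{
	unsigned char *out = malloc(CHUNK), *in = malloc(CHUNK);
	size_t done, want;
	ssize_t n;
	int fd = -1, rc = -1;

	if (out == NULL || in == NULL)
		goto finish;
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		goto finish;

	for (done = 0; done < total; done += (size_t)n) {
		want = total - done < CHUNK ? total - done : CHUNK;
		fill_pattern(out, done, want);
		n = write(fd, out, want);
		if (n <= 0)
			goto finish;
	}
	if (fsync(fd) != 0 || lseek(fd, 0, SEEK_SET) != 0)
		goto finish;

	for (done = 0; done < total; done += (size_t)n) {
		want = total - done < CHUNK ? total - done : CHUNK;
		n = read(fd, in, want);
		if (n <= 0)
			goto finish;
		/* Regenerate the expected bytes for what was read. */
		fill_pattern(out, done, (size_t)n);
		if (memcmp(in, out, (size_t)n) != 0)
			goto finish;
	}
	rc = 0;
finish:
	if (fd >= 0)
		close(fd);
	free(out);
	free(in);
	return (rc);
}
```

Call it from a trivial main() with a path on the mounted stick and a size
of a few megabytes, once with the nocluster* options and once without.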
> # mount
> /dev/ufs/2root on / (ufs, local)
> devfs on /dev (devfs, local)
> /dev/ufs/2tmp on /tmp (ufs, local, soft-updates)
> /dev/ufs/2usr on /usr (ufs, NFS exported, local, soft-updates)
> /dev/ufs/2var on /var (ufs, local, soft-updates)
> pid874 at mobileKamikaze:/var/run/automounter.amd.mnt on
> /var/run/automounter.amd.mnt (nfs)
> /dev/msdosfs/APRIL RYAN on
> /var/run/automounter.mnt/msdosfs/bb8a40b99a061c33a35f4e7275d1842a (msdosfs,
> local, noatime, noexec)
The labels obfuscate the device type for all mountpoints very well.
Your backtrace showed a panic in mmap(). mmap() actually uses the
support for vfs clustering (VOP_BMAP()), not vfs clustering itself,
to determine the size of the largest contiguous i/o that is possible.
It's possible that the bug only affects mmap(), but I doubt it.