kern/105964: Make MSDOSFS_LARGE a mount option
Oliver Fromme
olli at secnetix.de
Tue Nov 28 09:41:02 PST 2006
>Number: 105964
>Category: kern
>Synopsis: Make MSDOSFS_LARGE a mount option
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: change-request
>Submitter-Id: current-users
>Arrival-Date: Tue Nov 28 17:40:17 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator: Oliver Fromme
>Release: n/a
>Organization:
secnetix GmbH & Co. KG
http://www.secnetix.de/bsd
>Environment:
n/a
>Description:
This problem has been discussed on the -stable mailing list,
and Craig Rodrigues <rodrigc at crodrigues.org> asked me to
submit a PR for this issue because he's interested in picking
it up. So here we go.
The FAT file system format doesn't support file ID numbers
(UFS/FFS calls them "inode numbers"). Therefore MSDOSFS has
to create such numbers somehow. Currently there are two
hacks for that purpose, with different drawbacks:
-1- (Default) Use the directory entry offset of the file
as the file ID number. Assume that the whole medium is
divided into blocks the size of a directory entry
(32 bytes), and use that "block number" for the file
ID. Since file ID numbers (a.k.a. inodes) are 32 bits wide,
that algorithm overflows above 32 * 2^32 bytes = 128 GB.
If you try to mount a FAT file system larger than
128 GB, it will fail and print "disk too big, sorry".
-2- (With MSDOSFS_LARGE in the kernel) Maintain a table
that dynamically maps 64-bit offsets (computed as
above) to 32-bit ID numbers. This works for FAT
file systems larger than 128 GB (the code falls back
to algorithm 1 for file systems < 128 GB).
Two drawbacks:
-A- If a large number of files is accessed, the table
will grow very big and consume much kernel memory.
It is possible that the machine panics when it
runs out of kernel memory.
-B- Since the mapping is dynamic, file ID numbers may
be different when the file system is unmounted and
re-mounted. That will break NFS exports, because
NFS assumes that file ID numbers (which are used
for NFS handles) are constant.
It should be noted that those drawbacks only apply if
the file system is > 128 GB. For smaller file systems
the code will automatically use the simpler algorithm
described first. This is controlled by the flag
MSDOSFS_LARGEFS (different from MSDOSFS_LARGE!).
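The two schemes can be illustrated with a small userland sketch.
This is not the kernel code: the names fileno_direct, fileno_mapped,
struct fileno_map and MAP_CAPACITY are made up for illustration, and
the linear table is only a stand-in for whatever data structure the
MSDOSFS_LARGE code really uses. It merely shows where the 128 GB
limit comes from and why dynamically assigned IDs depend on access
order.

```c
#include <assert.h>
#include <stdint.h>

#define DIRENT_SIZE  32u    /* size of a FAT directory entry in bytes */
#define MAP_CAPACITY 1024   /* arbitrary bound, for this sketch only */

/* First byte offset whose "block number" no longer fits in 32 bits. */
#define DIRECT_LIMIT ((uint64_t)DIRENT_SIZE << 32)  /* 32 * 2^32 = 128 GB */

/*
 * Scheme 1: the file ID is the directory entry's byte offset divided
 * by the directory entry size.  Returns 0 on success, or -1 if the
 * result does not fit into a 32-bit file ID.
 */
static int
fileno_direct(uint64_t dirent_offset, uint32_t *fileno)
{
	uint64_t n = dirent_offset / DIRENT_SIZE;

	if (n > UINT32_MAX)
		return (-1);
	*fileno = (uint32_t)n;
	return (0);
}

/*
 * Scheme 2 (hypothetical table): 64-bit "block numbers" are mapped to
 * small 32-bit IDs in the order in which they are first seen.
 */
struct fileno_map {
	uint64_t key[MAP_CAPACITY];
	uint32_t used;          /* number of entries handed out so far */
};

static uint32_t
fileno_mapped(struct fileno_map *m, uint64_t dirent_offset)
{
	uint64_t n = dirent_offset / DIRENT_SIZE;
	uint32_t i;

	/* Return the ID assigned earlier, if this entry is already known. */
	for (i = 0; i < m->used; i++)
		if (m->key[i] == n)
			return (i + 1);

	/* Otherwise grow the table; this is the memory cost of drawback A. */
	assert(m->used < MAP_CAPACITY);
	m->key[m->used] = n;
	return (++m->used);
}
```

With a zero-initialized struct fileno_map, the first offset looked up
gets ID 1, the second gets ID 2, and so on. A fresh table (as after a
re-mount) that sees the same files in a different order hands out
different IDs, which is drawback B.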
>How-To-Repeat:
Try to mount FAT file systems of various sizes and encounter
the situations mentioned above.
For testing and experimenting, you can easily use an md(4)
device and newfs_msdos(8) to create a 160 GB FAT disk:
# truncate -s 160000000000 testfat.img
# mdconfig -a -t vnode -f testfat.img
md1
# fdisk -BI /dev/md1
******* Working on device /dev/md1 *******
# newfs_msdos -s 312496317 -c 128 -h 254 -u 63 /dev/md1s1 orb
/dev/md1s1: 312458112 sectors in 2441079 FAT32 clusters (65536 bytes/cluster)
bps=512 spc=128 res=32 nft=2 mid=0xf0 spt=63 hds=254 hid=0 bsec=312496317 bspf=19071 rdcl=2 infs=1 bkbs=2
# mount -t msdos -o ro /dev/md1s1 /mnt
mount_msdosfs: /dev/md1s1: Invalid argument
# dmesg | tail -1
mountmsdosfs(): disk too big, sorry
(Note: The newfs_msdos command is not very fast. It will
take a few seconds.)
>Fix:
Unfortunately, there is no real fix known for the problem.
However, the problem is made worse by the fact that you
have to recompile your kernel and reboot in order to
enable the second hack (kernel option MSDOSFS_LARGE).
That aspect of the problem could be fixed by making
it a mount option instead of a kernel compile option,
essentially converting the #ifdef's to regular if's.
Enabling the MSDOSFS_LARGE code by default has even been
considered. However, because of the drawbacks (i.e.
possibility of a panic because of kernel memory usage, and
inability to NFS-export the file system) it should only be
used if specifically requested by the user.
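As a rough sketch of the proposed change, the decision that is now
made by the preprocessor could be made at mount time instead. The
names below (choose_fileno_mode, enum fileno_mode, large_requested)
are hypothetical and not taken from the MSDOSFS sources; the exact
boundary comparison is also an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIRENT_SIZE  32u
#define DIRECT_LIMIT ((uint64_t)DIRENT_SIZE << 32)  /* 32 * 2^32 = 128 GB */

enum fileno_mode {
	FILENO_DIRECT,   /* algorithm 1: offset / 32 */
	FILENO_MAPPED,   /* algorithm 2: dynamic mapping table */
	FILENO_REFUSE    /* mount fails: "disk too big, sorry" */
};

/*
 * Today the choice between FILENO_MAPPED and FILENO_REFUSE is fixed
 * when the kernel is built:
 *
 *	#ifdef MSDOSFS_LARGE
 *		... use the mapping table ...
 *	#else
 *		... refuse the mount ...
 *	#endif
 *
 * With a mount option, the same choice becomes a regular "if" on a
 * per-mount flag (large_requested here, a hypothetical name).
 */
static enum fileno_mode
choose_fileno_mode(uint64_t fs_bytes, bool large_requested)
{
	if (fs_bytes <= DIRECT_LIMIT)
		return (FILENO_DIRECT);   /* small enough, no table needed */
	if (large_requested)
		return (FILENO_MAPPED);   /* user accepted the drawbacks */
	return (FILENO_REFUSE);
}
```

For the 160000000000-byte image from How-To-Repeat, this returns
FILENO_REFUSE without the option and FILENO_MAPPED with it, so the
drawbacks only apply when the user explicitly asks for them.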
>Release-Note:
>Audit-Trail:
>Unformatted: