msdosfs not MPSAFE

Bruce Evans brde at optusnet.com.au
Mon Jul 16 10:18:20 UTC 2007


On Thu, 12 Jul 2007, Kostik Belousov wrote:

> On Thu, Jul 12, 2007 at 11:33:40PM +1000, Bruce Evans wrote:
>>
>> On Thu, 12 Jul 2007, Kostik Belousov wrote:
>>
>>> On Wed, Jul 11, 2007 at 12:08:19AM +1000, Bruce Evans wrote:
>>>> msdsofs has been broken since Giant locking for file systems (or syscalls)
>>>> was removed.  It allows multiple threads to race accessing the shared
>>>> static buffer `nambuf' and related variables.  This causes remarkably
>>
>>> It seems that msdosfs_lookup() can sleep, thus Giant protection would be
>>> lost.
>>
>> It can certainly block in bread().
> Besides bread(), there is a (re)locking for ".." case, and deget() call,
> that itself calls malloc(M_WAITOK), vfs_hash_get(), getnewvnode() and
> readep(). The latter itself calls bread().
>
> This is from the brief look.

I think msdosfs_lookup() doesn't need to own nambuf near the deget()
call.  Not sure -- I was looking more at msdosfs_readdir().

>> How does my adding Giant locking help?  I checked that at least in
>> FreeBSD-~5.2-current, msdosfs_readdir() is already Giant-locked, so my
>> fix just increments the recursion count.  What happens to recursively-
>> held Giant locks across sleeps?  I think they should cause a KASSERT()
>> failure, but if they are handled by only dropping Giant once then my
>> fix might sort of work but sleeps would be broken generally.
>>
> Look at the kern/kern_sync.c:_sleep(). It does DROP_GIANT(), that (from
> the sys/mutex.h) calls mtx_unlock() until Giant is owned.

So it is very mysterious that Giant locking helped.  Anyway, it doesn't
work, and cases where it doesn't help showed up in further testing.

sx xlocking works, but is not quite right:

% Index: msdosfs_denode.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_denode.c,v
% retrieving revision 1.73
% diff -u -2 -r1.73 msdosfs_denode.c
% --- msdosfs_denode.c	16 Jun 2004 09:47:03 -0000	1.73
% +++ msdosfs_denode.c	13 Jul 2007 04:58:35 -0000
% @@ -52,10 +52,12 @@
%  #include <sys/systm.h>
%  #include <sys/kernel.h>
% +#include <sys/lock.h>
%  #include <sys/mount.h>
%  #include <sys/malloc.h>
% +#include <sys/mutex.h>
%  #include <sys/bio.h>
%  #include <sys/buf.h>
% +#include <sys/sx.h>

Include this; other changes in this hunk unrelated.

%  #include <sys/vnode.h>
% -#include <sys/mutex.h>
% 
%  #include <vm/vm.h>
% @@ -68,9 +70,11 @@
%  #include <fs/msdosfs/fat.h>
% 
% +struct sx mbnambuf_lock;
% +

Declare this.  This file has nothing to do with nambuf, but it is the only
convenient place to add the init and destroy calls.  In -current, there
.vfs_init and .vfs_uninit hooks no longer exist, so the patch would be
even larger.

%  static MALLOC_DEFINE(M_MSDOSFSNODE, "MSDOSFS node", "MSDOSFS vnode private part");
% 
%  static struct denode **dehashtbl;
%  static u_long dehash;			/* size of hash table - 1 */
% -#define	DEHASH(dev, dcl, doff)	(dehashtbl[(minor(dev) + (dcl) + (doff) / 	\
% +#define	DEHASH(dev, dcl, doff)	(dehashtbl[(minor(dev) + (dcl) + (doff) / \
%  				sizeof(struct direntry)) & dehash])
%  static struct mtx dehash_mtx;

Unrelated cleanup.

% @@ -117,4 +121,5 @@
%  	dehashtbl = hashinit(desiredvnodes/2, M_MSDOSFSMNT, &dehash);
%  	mtx_init(&dehash_mtx, "msdosfs dehash", NULL, MTX_DEF);
% +	sx_init(&mbnambuf_lock, "mbnambuf");
%  	return (0);
%  }
% @@ -128,4 +133,5 @@
%  		free(dehashtbl, M_MSDOSFSMNT);
%  	mtx_destroy(&dehash_mtx);
% +	sx_destroy(&mbnambuf_lock);
%  	dehash_init--;
%  	return (0);
% Index: msdosfs_lookup.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_lookup.c,v
% retrieving revision 1.40
% diff -u -2 -r1.40 msdosfs_lookup.c
% --- msdosfs_lookup.c	26 Dec 2003 17:24:37 -0000	1.40
% +++ msdosfs_lookup.c	13 Jul 2007 06:13:04 -0000
% @@ -54,4 +54,5 @@
%  #include <sys/bio.h>
%  #include <sys/buf.h>
% +#include <sys/sx.h>
%  #include <sys/vnode.h>
%  #include <sys/mount.h>
% @@ -63,4 +64,6 @@
%  #include <fs/msdosfs/fat.h>
% 
% +extern struct sx mbnambuf_lock;
% +
%  /*
%   * When we search a directory the blocks containing directory entries are

This shouldn't be extern.

% @@ -78,11 +81,11 @@
%   * memory denode's will be in synch.
%   */
% -int
% -msdosfs_lookup(ap)
% +static int
% +msdosfs_lookup_locked(
%  	struct vop_cachedlookup_args /* {
%  		struct vnode *a_dvp;
%  		struct vnode **a_vpp;
%  		struct componentname *a_cnp;
% -	} */ *ap;
% +	} */ *ap)
%  {
%  	struct vnode *vdp = ap->a_dvp;
% @@ -560,4 +564,20 @@
% 
%  /*
% + * XXX msdosfs_lookup() is split up because unlocking before all the returns
% + * in the original function would be too churning.
% + */
% +int
% +msdosfs_lookup(ap)
% +	struct vop_cachedlookup_args *ap;
% +{
% +	int error;
% +
% +	sx_xlock(&mbnambuf_lock);
% +	error = msdosfs_lookup_locked(ap);
% +	sx_xunlock(&mbnambuf_lock);
% +	return (error);
% +}
% +
% +/*
%   * dep  - directory entry to copy into the directory
%   * ddep - directory to add to
% Index: msdosfs_vnops.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v
% retrieving revision 1.147
% diff -u -2 -r1.147 msdosfs_vnops.c
% --- msdosfs_vnops.c	4 Feb 2004 21:52:53 -0000	1.147
% +++ msdosfs_vnops.c	15 Jul 2007 04:07:36 -0000
% @@ -60,4 +62,5 @@
%  #include <sys/proc.h>
%  #include <sys/mount.h>
% +#include <sys/sx.h>
%  #include <sys/unistd.h>
%  #include <sys/vnode.h>
% @@ -70,14 +73,14 @@
%  #include <vm/vnode_pager.h>
% 
% -#include <machine/mutex.h>
% -
%  #include <fs/msdosfs/bpb.h>
% -#include <fs/msdosfs/msdosfsmount.h>
%  #include <fs/msdosfs/direntry.h>
%  #include <fs/msdosfs/denode.h>
%  #include <fs/msdosfs/fat.h>
% +#include <fs/msdosfs/msdosfsmount.h>
% 
%  #define	DOS_FILESIZE_MAX	0xffffffff
% 
% +extern struct sx mbnambuf_lock;
% +

Declare this; other changes in this hunk unrelated.

%  /*
%   * Prototypes for MSDOSFS vnode operations
% @@ -1559,4 +1594,5 @@
%  	}
% 
% +	sx_xlock(&mbnambuf_lock);
%  	mbnambuf_init();
%  	off = offset;
% @@ -1687,4 +1727,6 @@
%  	}
%  out:
% +	sx_xunlock(&mbnambuf_lock);
% +
%  	/* Subtract unused cookies */
%  	if (ap->a_ncookies)

I didn't add this sx lock in subr_witness.c.  This patch is already too
large.  About half of sx locks seem to be missing in witness.

Please fix this better.

Bruce


More information about the freebsd-fs mailing list