NFS exports atomic and on-the-fly atomic updates

Andrey Simonenko simon at comsys.ntu-kpi.kiev.ua
Tue Jun 2 15:43:00 UTC 2009


Hello,

Here I want to describe changes that make updates of NFS exports lists
atomic, allow dynamic on-the-fly atomic updates of NFS exports lists,
and improve the security of NFS exports.

Solved tasks:
-------------

1. Updates of NFS export specifications (spec for short) are atomic.  Users
   of the NFS server will not get EACCES for exported file systems, or wrong
   access rights, while the exports list is reloaded.

2. The mountd utility has an option for testing configuration; one can now
   check and see the real configuration without loading it into nfsserver.

3. The mountd utility does not load incomplete settings into nfsserver, so
   a wrong configuration cannot allow exports that should be denied.

4. Atomic on-the-fly modifications of NFS export specifications were
   implemented; it is possible to change export settings dynamically.

5. If some file system is mounted or unmounted, sending SIGHUP to mountd
   is no longer required; this change removes several race conditions.

6. NFS exports related code is a part of nfsserver/ code, NFS export related
   data can be removed from directories not related to NFS.

Which actions are atomic?
-------------------------

1. Loading export settings from exports(5) file into nfsserver is atomic for
   each exported file system and for all exported file systems.

2. Loading updates into nfsserver is atomic for each exported file system and
   for all exported file systems.

3. If a file system was mounted, then loading export settings for it into
   nfsserver is atomic for this file system.

Which actions are not atomic?
-----------------------------

1. Loading export specifications for a WebNFS file system and specifying
   WebNFS related settings require more than one system call.

2. Since VFS events such as mounting and unmounting are asynchronous,
   events for all exported and not exported file systems are checked
   as separate system calls.

Since nmount(2) is not used in this implementation and subdirectory exports
are not allowed, it is unlikely that these changes will be accepted.

If the absence of insecure subdirectory exports is not a problem, then
it is possible to support both the existing mountd and the new API on
7-STABLE (see the nfsserver/nfs_srvsubs.c:nfsrv_fhtovp() function patch).

The mountd utility was completely rewritten, actually the better name
for new utility with new properties would be "nfse".  The single source
file mountd.c was split into three .c and three .h files:  mountd.c
(previous code was rewritten and new code was added), mountd.h (new code),
mountd_conf.c (new code), mountd_conf.h (new code), mountd_xdr.c (previous
code was updated to support new data structure and options), mountd_xdr.h
(previous code).

The analog of kern/vfs_export.c was written from scratch and is now called
nfsserver/nfs_export.c.

This version of mountd can be used on a modified 7.2 system.
It was tested (except WebNFS related settings) on the amd64 and i386
architectures.

Support of the 8.0 system is possible: it is necessary to modify the
nfs_export.c:nfse_check() function (~60 lines) and add new NFSv4 related
options.  But since both sys/nfsserver/ and sys/fs/nfsserver/ have
functions that call VFS_CHECKEXP, and there is ongoing activity in NFS
development, it is unclear which arguments nfse_check() should accept.
It is better to discuss the argument list and semantics of nfse_check()
first.  I think this is the only function that prevents using these
changes on the 8.0 system.

Available patches have only necessary changes to sys/nfsserver/, sys/kern/,
sys/sys/ and sys/conf/.  Complete changes require removing all NFS exports
related data and code from directories not related to NFS.  Also nfs_pub
structure should be a part of the new nfsserver/nfs_export.c file.
I did not make all these changes (all file systems, all NFS related flags
in mount.h, etc.), just to show the main parts of changes.

The following text contains a more or less detailed description of the
changes (I have definitely forgotten to mention something here).

Major improvements:
-------------------

* Now all export spec updates are atomic.  mountd uses nfssvc(2) for this
  (the new nfssvc(NFSSVC_EXPORT) call is used).  Now it is safe to reload
  exports files.

* New nfs_export.c file was added to nfsserver/, all API details are
  located in nfs_export.h.  All NFS related flags and structures are part
  of nfsserver.

* Now mountd uses the kernel event filter EVFILT_FS to see mount and
  unmount VFS events.  The mount(8) utility should not send SIGHUP to
  mountd any more.  New EVENTHANDLERs were declared: vfs_mount_event and
  vfs_unmount_event.  Functions registered for these handlers are invoked
  when a file system is mounted or unmounted respectively.  The nfsserver
  uses these event handlers to synchronize its own data with the available
  file systems.  A memory leak that occurred when an exported file system
  was unmounted was removed.  Now the nfsserver understands covered file
  systems (a file system mounted on the mount point of another file system).

* The mountd utility has a new option -c that allows export specs to be
  modified on-the-fly.  One can clear, add, update and delete export specs.
  All updates are atomic.  One set of commands works like a transaction:
  it is applied completely or not applied at all.

* Now mountd has the -t switch: parse configuration files or commands and
  output all settings to stdout.  This option allows to check and see real
  configuration.
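The "applied completely or not applied at all" behaviour of one -c command
set can be pictured roughly as below.  This is only a sketch under stated
assumptions: the names cmd, export_table, apply_one() and apply_transaction()
are hypothetical and are not taken from the nfse sources, specs are plain
strings, and only add and delete are modelled.  The idea shown is to apply
every command to a staged copy and replace the live table only if all of
them succeed.

```c
#include <assert.h>
#include <string.h>

#define MAX_SPECS 8		/* arbitrary limit for this sketch */

enum cmd_op { CMD_ADD, CMD_DELETE };

struct cmd {
	enum cmd_op	 op;
	const char	*spec;	/* export spec as plain text (assumption) */
};

struct export_table {
	int		 count;
	const char	*specs[MAX_SPECS];
};

/* Apply one command to a table; return 0 on success, -1 on failure. */
static int
apply_one(struct export_table *t, const struct cmd *c)
{
	int i;

	/* Look for an existing spec with the same text. */
	for (i = 0; i < t->count; i++)
		if (strcmp(t->specs[i], c->spec) == 0)
			break;

	switch (c->op) {
	case CMD_ADD:
		if (i < t->count || t->count == MAX_SPECS)
			return (-1);	/* duplicate or table full */
		t->specs[t->count++] = c->spec;
		return (0);
	case CMD_DELETE:
		if (i == t->count)
			return (-1);	/* nothing to delete */
		t->specs[i] = t->specs[--t->count];
		return (0);
	}
	return (-1);
}

/*
 * Apply all commands to a staged copy and replace the live table only
 * if every command succeeded.
 */
static int
apply_transaction(struct export_table *live, const struct cmd *cmds,
    int ncmds)
{
	struct export_table staged;
	int i;

	assert(live != NULL && cmds != NULL);
	staged = *live;
	for (i = 0; i < ncmds; i++)
		if (apply_one(&staged, &cmds[i]) != 0)
			return (-1);	/* live table is untouched */
	*live = staged;
	return (0);
}
```

In a real server the final replacement would additionally have to be done
under whatever lock protects the live table, so that readers never observe
a half-updated state.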

Incompatible security changes:
------------------------------

* Now subdirectory export is disallowed.  Subdirectory export does not
  improve security; instead it is an easy way to misconfigure (export
  settings for a subdirectory can be completely unrelated to this
  subdirectory and do not protect access to other parts of the exported
  file system).

  The nfsserver exports file systems, not directories, and it looks like
  subdirectory export for NFS is too complex or impossible to implement
  completely.  Anyway there is nullfs.

  Having read RFCs, documentation for other NFS implementations and
  discussions in user groups on the Internet, I think that this (radical)
  modification will improve security.  This version of mountd has the same
  logic of export rules as nfsserver has.

* Now mountd by default allows mounting of file systems, subdirectories and
  regular files (if the -r flag is on) in exported file systems.  The
  -alldirs option became obsolete.

Compatible security changes:
----------------------------

* Ignoring exports files is not safe, since remote users can get wrong access
  rights.  Alternative compatible solution: all exports files must be
  present; a user can also specify directory/ and all regular files from the
  given directories will be loaded (any such directory can be absent).

* Now if mountd cannot correctly parse export specification for some file
  system, then it does not load anything to nfsserver for this file system.
  Ignoring something in exports file is not safe.

* Now security flavors are per address specification settings in nfsserver
  and mountd.

* In rare cases mountd completely ignores the settings in an exports file
  and does not load anything into nfsserver (this can happen if a mistake
  in the configuration does not allow file parsing to finish).

Updates for mountd:
-------------------

* Now IPv4 and IPv6 addresses are used everywhere, since the kernel knows
  nothing about domain names, netgroups, etc.  The mountdtab file now
  contains only addresses, and the MOUNT protocol's EXPORT and DUMP
  procedures output addresses.  This removes problems with reverse name
  resolution, but sometimes entries are not removed on unmount (it depends
  on the address used).

* Better output for MOUNT protocol's procedure EXPORT: host is an address,
  network is an address with prefix.

* Now mountdtab is parsed more carefully.

* Zone scope index checking was removed for IPv6 addresses, nfsserver does
  not check zone scope index.

* Now mountdtab is saved only when mountd exits, since no other program in
  the base system uses this file.  The representation of the mountdtab
  file's content in memory was optimized.

* Do not leave PID file if some error occurred and mountd exited.

* Allowed to use loopback addresses in the -h option. (I do not like design
  idea of -h, -p and similar options.)

* Corrected incorrect binding when -p option is not used (nobody saw this
  because this can happen very seldom, but I could reproduce this error).

* Wrong implementation of mask creation when prefix length is given as
  /prefixlength was corrected.
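For illustration, creating a mask from /prefixlength can be sketched as
follows.  This is not the code from mountd; the helper names
prefixlen_to_mask4() and prefixlen_to_mask6() are hypothetical, and the IPv4
mask is returned in host byte order.  The usual pitfall is the /0 case,
where a shift by the full width of the type is undefined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build an IPv4 netmask (host byte order) from a prefix length.
 * Returns 0 on success, -1 if the prefix length is out of range.
 */
static int
prefixlen_to_mask4(unsigned int prefixlen, uint32_t *mask)
{
	assert(mask != NULL);
	if (prefixlen > 32)
		return (-1);
	/* Shifting a 32-bit value by 32 is undefined, so treat /0 apart. */
	*mask = (prefixlen == 0) ? 0 :
	    (uint32_t)(UINT32_MAX << (32 - prefixlen));
	return (0);
}

/*
 * Build an IPv6 netmask as 16 bytes in network order.
 * Returns 0 on success, -1 if the prefix length is out of range.
 */
static int
prefixlen_to_mask6(unsigned int prefixlen, uint8_t mask[16])
{
	size_t i;

	assert(mask != NULL);
	if (prefixlen > 128)
		return (-1);
	for (i = 0; i < 16; i++) {
		if (prefixlen >= 8) {
			mask[i] = 0xff;
			prefixlen -= 8;
		} else {
			/* Partial byte: keep the top prefixlen bits. */
			mask[i] = (uint8_t)(0xff << (8 - prefixlen));
			prefixlen = 0;
		}
	}
	return (0);
}
```

With these helpers a /8 prefix gives the IPv4 mask 0xff000000, which
matches the 10.0.0.0/8 style networks used in the examples in this message.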

Updates for nfsserver:
----------------------

* Previous nfsserver could access released memory returned by VFS_CHECKEXP.
  New code does not have this problem.

Updates for exports(5):
-----------------------

* Added new option -host, which allows host names and identically named
  netgroups to be used at once.

* Added new option -rw: read-write access.

* Added flag `!' for hosts and networks, this flag means "deny access".

* Added new line "options: ...", right now it is used for global -sec,
  -no_mntproc_dump and -no_mntproc_export options, later it can be used for
  NFSv4.

* Added new option -nospec, that means "this line does not have any address
  specifications".

* Added new option -no_mntproc_dump to disable MOUNT protocol's procedure
  DUMP.

* Added new option -no_mntproc_export to disable MOUNT protocol's procedure
  EXPORT.

* exports(5) says that -o is the only option kept for compatibility.
  Actually there are others: -root and -r for -maproot, -m for -mask and
  -n for -network.  Now mountd logs a warning message if an obsolete option
  is used.

* No option is allowed between -network and -mask.

* Now a #-comment can be anywhere in a line.

* A \xxx octal number can be used in directory names and option arguments
  for representing an arbitrary character.

* Now it is possible to mix hosts and netgroups with networks:
  "host1 -network=somenetwork host2".

* Now it is possible to change options for a particular host/network in one
  line: "-ro -mapall=user1 host1 -mapall=user2 host2" (host2 will inherit
  the previous option -ro, but will get the new -mapall option).  Since
  exports(5) previously said that options must be given before hosts and
  networks, this change is backward compatible and allows all settings for
  one file system to be represented in one logical line.

* Content of exports(5) was simplified and updated.
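As an illustration of the \xxx notation mentioned above, a decoder for such
escapes might look like the sketch below.  It is not the parser from mountd:
the name decode_octal_escapes() is hypothetical, and it assumes that an
escape always has exactly three octal digits and that a value of 0 or above
255 is rejected.  The real rules may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decode \xxx escapes (exactly three octal digits, an assumption of this
 * sketch) from src into dst.  dst must have room for strlen(src) + 1 bytes.
 * Returns the decoded length, or -1 on a malformed or out-of-range escape.
 */
static long
decode_octal_escapes(const char *src, char *dst)
{
	size_t n = 0;
	unsigned int value;
	int i;

	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		if (*src != '\\') {
			dst[n++] = *src++;	/* ordinary character */
			continue;
		}
		value = 0;
		for (i = 1; i <= 3; i++) {
			/* Also stops safely at the terminating NUL. */
			if (src[i] < '0' || src[i] > '7')
				return (-1);
			value = value * 8 + (unsigned int)(src[i] - '0');
		}
		if (value == 0 || value > 255)
			return (-1);
		dst[n++] = (char)value;
		src += 4;			/* backslash + 3 digits */
	}
	dst[n] = '\0';
	return ((long)n);
}
```

Under these assumptions a directory name with a space could be written in
an exports file as /my\040dir, which decodes to "/my dir".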

Open questions and tasks:
-------------------------

* There must be a global solution to check whether it is possible to unload
  a KLD module when no process currently is working with its syscalls.

* It looks like the first argument of nfssvc(2) is not a set of flags any
  more (according to STABLE, CURRENT and the NFSv4 implementation).  Maybe
  it makes sense to make it a value, not flags.

* WebNFS related data in nfsserver is protected when export settings are set
  (in vfs_export.c and new nfs_export.c), but when other parts of nfsserver
  access WebNFS related data no synchronization is performed.

* If a file system cannot be exported in NFS, then there must be some flag
  to indicate this (MNT_NFSEXPORTABLE or something more general, see
  fs/msdosfs/msdosfs_vfsops.c for example).

* Should signals be checked more often in mountd?  Right now signals are
  checked when mountd is waiting for RPC request, or if nfssvc's commands
  transaction timeout occurred.  Previous code has race conditions with
  signals and does too many things disallowed by SUSv3 in signal handlers.

* Should there be any limit in nfsserver on the number of export
  specifications and the number of command transactions?

* Maybe mountd should be renamed to something else, e.g. nfse.  NFSv4
  does not use the MNT procedure, but as I understand it still needs a
  utility for configuring access rights.  Also the nfse command name is
  more obvious for -c commands, e.g. "nfse -c 'add /fs -ro'".  As well,
  /etc/exports can be renamed to /etc/nfs.exports, and /var/db/mountdtab
  to /var/db/nfse.mounts.  Such renaming will also allow mountd and the
  new nfse to be used in 7-STABLE at the same time, and mount(8) from
  7-STABLE will not send SIGHUP to nfse, since its PID will be saved in
  the nfse.pid file.

* netgroup.5 can be moved to src/lib/libc/gen/ or src/share/man/man5, this
  documentation is not part of mountd.
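Regarding the question above about signals: a common way to keep a signal
handler within what POSIX allows is to only set a flag in the handler and
do the real work from the main loop.  The sketch below shows that general
pattern, not the code in mountd; the names got_sighup, on_sighup(),
install_sighup_handler() and take_reload_request() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>

/* Set by the handler, read and cleared by the main loop. */
static volatile sig_atomic_t got_sighup;

/* The handler only records that the signal arrived. */
static void
on_sighup(int signo)
{
	(void)signo;
	got_sighup = 1;
}

/* Install the SIGHUP handler; returns 0 on success, -1 on error. */
static int
install_sighup_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sighup;
	sigemptyset(&sa.sa_mask);
	return (sigaction(SIGHUP, &sa, NULL));
}

/*
 * Called from the main loop at points where a reload is safe.
 * Returns 1 if a reload was requested since the last call, else 0.
 */
static int
take_reload_request(void)
{
	if (!got_sighup)
		return (0);
	got_sighup = 0;
	return (1);
}
```

How often the main loop calls something like take_reload_request() is
exactly the trade-off raised in the question: checking only while waiting
for an RPC request is simple, while checking more often reduces the delay
before a reload.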

Examples:
---------

1. Correct file:
 
exports file:

options: -no_mntproc_dump
/fs -host 1.1.1.1 -ro 2.2.2.2
/fs -network 10.20.30.40
/home -mapall nobody -network 10/8 -mapall operator -network 20.1/8

"mountd -t" output:

configure: reading file exports
Global options:
    -no_mntproc_dump

Directory /fs
    Export specifications:
	-ro -sec sys -maproot=-2:-2 -host 2.2.2.2
	-rw -sec sys -maproot=-2:-2 -host 1.1.1.1
	-rw -sec sys -maproot=-2:-2 -network 10.0.0.0/8

Directory /home (mount point)
    Export specifications:
	-rw -sec sys -mapall 2:5 -network 20.0.0.0/8
	-rw -sec sys -mapall 65534:65534 -network 10.0.0.0/8

2. Wrong file:

exports:

/fs -ro 1.1.1.1
/fs -network 10/8 -host 1.1.1.1
/home -quiet -ro

"mountd -t" output:

configure: reading file exports
parsing error: exports:2: duplicated address specification 1.1.1.1 was found in this line
parsing error: exports:2: -host option's argument parsing failed

Directory /fs
    Wrong configuration

Directory /home (mount point)
    File system options:
	-quiet
    Export specifications:
	-ro -sec sys -maproot=-2:-2 (default)

3. Commands testing:

mountd -t -c 'add /fs -ro -mapall nobody -host 1.1.1.1 -network !10/8' \
    -c 'flush /home' -c 'update /usr -ro -mapall operator'

configure: parsing -c commands

Directory /fs
    Commands:
	-c    add -ro -sec sys -mapall 65534:65534 -host 1.1.1.1
	-c    add -ro -sec sys -mapall 65534:65534 -network !10.0.0.0/8

Directory /home (mount point)
    Commands:
	-c  flush

Directory /usr (mount point)
    Commands:
	-c update -ro -sec sys -mapall 2:5 (default)

4. Specifying exports(5) files

mountd /etc/exports /etc/local.exports /usr/local/nfs-export/

Files /etc/exports and /etc/local.exports must be present.
The nfs-export directory can be absent; if it is present, or appears
later, then all regular files from it are read.

Sources
-------

http://comsys.ntu-kpi.kiev.ua/~simon/nfse/
MD5 (nfse-20090602.tar.bz2) = 8670b95fcfb7a80433afa4db43143418

