testing/review of atomic export update patch

Konstantin Belousov kostikbel at gmail.com
Tue Sep 18 08:59:46 UTC 2012


On Mon, Sep 17, 2012 at 05:32:44PM -0400, Rick Macklem wrote:
> Konstantin Belousov wrote:
> > On Sun, Sep 16, 2012 at 05:41:25PM -0400, Rick Macklem wrote:
> > > Hi,
> > >
> > > There is a simple patch at:
> > >   http://people.freebsd.org/~rmacklem/atomic-export.patch
> > > that can be applied to a kernel + mountd, so that the new
> > > nfsd can be suspended by mountd while the exports are being
> > > reloaded. It adds a new "-S" flag to mountd to enable this.
> > > (This avoids the long standing bug where clients receive ESTALE
> > >  replies to RPCs while mountd is reloading exports.)
> > 
> > This looks simple, but also somewhat worrisome. What would happen
> > if the mountd crashes after nfsd suspension is requested, but before
> > resume was performed ?
> > 
> > Might be, mountd should check for suspended nfsd on start and
> > unsuspend
> > it, if some flag is specified ?
> Well, I think that happens with the patch as it stands.
> 
> suspend is done if the "-S" option is specified, but that is a no op
> if it is already suspended. The resume is done no matter what flags
> are provided, so mountd will always try and do a "resume".
> --> get_exportlist() is always called when mountd is started up and
>     it does the resume unconditionally when it completes.
>     If mountd repeatedly crashes before completing get_exportlist()
>     when it is started up, the exports will be all messed up, so
>     having the nfsd threads suspended doesn't seem so bad for this
>     case (which hopefully never happens;-).
> 
> Both suspend and resume are just no ops for unpatched kernels.
> 
> Maybe the comment in front of "resume" should explicitly explain
> this, instead of saying resume is harmless to do under all conditions?
> 
> Thanks for looking at it, rick
I see.

My other note is that nothing protects against suspend and resume
running in parallel. For instance, one thread could set
suspend_nfsd = 1 and be descheduled, while another thread executes the
resume code sequence in the meantime. That second thread would see
suspend_nfsd != 0 while nfsv4rootfs_lock is not held, and would try to
unlock it. It seems that nfsv4_unlock would silently return. The
suspending thread then resumes and obtains the lock. You end up with
suspend_nfsd == 0 but the lock held.