panic!("docallb") in nfsrv_docallback

Rick Macklem rmacklem at uoguelph.ca
Sat Sep 5 04:14:22 UTC 2020


Alan Somers wrote:
>I just saw this panic on a 12-stable machine.  Unfortunately, I don't have
>a core dump, just a stack trace.  It was serving NFS v4.0, with delegations
>enabled.  The clients were all Debian, with Linux 3.16.0.
>
>The proximal cause of the panic seems to be that the file had a write
>delegation issued to an unconfirmed client.  Root cause is harder to
>determine.  Did the kernel previously issue a delegation to an unconfirmed
>client?  Or did the client somehow change to an unconfirmed state after the
>delegation was issued, perhaps due to a race?
>
>It's hard to tell, but I don't see any checks for lc_flags &
>LCL_NEEDSCONFIRM in nfsrv_openctrl (which issues the delegations), so I'm
>guessing that that's the problem.
I guess I should have looked at the code before doing the last post.
The check is in nfsrv_getclient(), that is called by nfsrv_opencheck().
nfsrv_opencheck() - Checks to see if an Open is allowed.
nfsrv_openctrl() - Does the Open, assuming nfsrv_opencheck() determined it
  was allowed.

>  If so, then the event trace would look
>like this:
>
>1) Client Alice sends SETCLIENTID.  The server creates a client state
>structure
>   for her.
>_) Client Alice should've sent SETCLIENTID_CONFIRM, but doesn't.  Bad Alice!
>2) Client Alice sends OPEN for some file, and is issued a write delegation.
>   The server shouldn't have issued it, because Alice's client ID is
>   unconfirmed.  Bad server!
I don't think this can happen. From looking at the code, an NFSERR_EXPIRED
reply to the Open should have happened.

>3) Client Bob tries to do a GETATTR on that same file.
>4) In nfsrv_checkgetattr, the kernel finds a write delegation for that file,
>   owned by client Alice.
I think the server needs to check for LCL_NEEDSCONFIRM in here.
It gets the "clp" from the FH, but it "assumes" a confirmed ClientID.

I'll code up a patch to add this check to nfsrv_checkgetattr().

>5) The kernel tries to send a NFSV4OP_CBGETATTR callback to Alice, to see
>if the
>   file's attributes have changed.
>6) But Alice's client ID is unconfirmed.  Oh no!  Panic!
>
>Does this sound plausible?  Should there be a check for LCL_NEEDSCONFIRM
>somewhere around line 3166 in nfs_nfsdstate.c?  Grateful for any help.
Well, it doesn't appear that the Open could occur when the ClientID was
not confirmed.
--> The obvious case you listed above is caught by nfsrv_opencheck().
      Now, could a SetClientID happen between nfsrv_opencheck() and
      nfsrv_openctrl()?
      --> I don't think so. If you look at nfsrv_setclient() which does SetClientID,
             it grabs the nfsv4rootfs_lock, locking out all other nfsd threads.
             It can't acquire the lock while a "shared lock" (I called a refcnt) is
             held by any other nfsd thread and the thread doing an Open will
             hold the shared lock (refcnt).

So, I think the Open with delegation would have been issued when the
ClientID was confirmed.
--> Then I suspect the client did another SetClientID that put the ClientID
       back to unconfirmed (the obvious one is a client reboot).
       --> One quirk of SetClientID/SetClientIDConfirm is that old Open/Lock
             state cannot be discarded until it is confirmed, so the old Delegations
             would remain on the ClientID.
--> Then a client did a Getattr on the file that had the old Delegation still
       allocated to it.

I'd definitely say that nfsrv_checkgetattr() needs to check for an unconfirmed
ClientID and return without attempting a callback.
--> The exclusive lock on nfsv4rootfs_lock acquired when doing SetClientID
       should also guarantee that, if the ClientID is confirmed when
       nfsrv_checkgetattr() tests for it, it will remain that way until after the
       callback is completed.

I'll come up with a patch and stick it on phabricator, rick



-Alan

P.S.: stack trace

kdb_backtrace
vpanic
panic
nfsrv_docallback
nfsrv_checkgetattr
nfsrvd_getattr
nfsrvd_dorpc
nfssvc_program
svc_run_internal
svc_thread_start
fork_exit
fork_trampoline
_______________________________________________
freebsd-hackers at freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe at freebsd.org"



More information about the freebsd-hackers mailing list