Re: nfs hang

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Fri, 14 Nov 2025 17:58:16 UTC
On Fri, Nov 14, 2025 at 5:09 AM Ronald Klop <ronald-lists@klop.ws> wrote:
>
>
> From: Rick Macklem <rick.macklem@gmail.com>
> Date: Friday, 14 November 2025 12:53
> To: Mark Millard <marklmi@yahoo.com>
> CC: Ronald Klop <ronald@freebsd.org>, freebsd-fs@freebsd.org
> Subject: Re: nfs hang
>
> On Thu, Nov 13, 2025 at 4:51PM Mark Millard <marklmi@yahoo.com> wrote:
> >
> > Ronald Klop <ronald_at_FreeBSD.org> wrote on
> > Date: Thu, 13 Nov 2025 17:17:48 UTC :
> >
> > > On 13-11-2025 at 14:06, Rick Macklem wrote:
> > > > On Thu, Nov 13, 2025 at 2:45AM Ronald Klop <ronald@freebsd.org> wrote:
> > > >>
> > > >> On 13-11-2025 at 11:41, Ronald Klop wrote:
> > > >>> . . .
> > > >>>
> > > >>>
> > > >>
> > > >>
> > > >> . . .
> > > > Do you have more than one client mounting the file system?
> > > > If you do, make sure they all have different /etc/hostid's.
> > > > (Cloning a system disk without deleting /etc/hostid can
> > > > result in multiple clients with the same /etc/hostid. That
> > > > > means they are "the same client" to the NFSv4 server
> > > > and that can cause the above.)
> > > >
> > > > If this is not the problem, I don't know why you'd see the
> > > > above but I suspect the above explains the hang.
> > > >
> > > > rick
> > > >
> > >
> > >
> > > Two clients. Both have different /etc/hostid.
> > >
> > > I noticed that the procstat stacks start with "null_reclaim". And poudriere null-mounts the nfs mounts in the poudriere-jails.
> >
> > Do the poudriere jails on each host use the host's /etc/hostid (by content)?
> >
> > Any worries about needing poudriere jail /etc/hostid content uniqueness?
> From NFSv4's point of view, /etc/hostid is used at mount time.
> The client will use the one outside of any jail, since that is where the
> mount is done from.
>
> Now, although I was thinking about the client (since that is where
> the hangs occur), it might be an issue if you were running the nfsd
> in multiple jails that have the same /etc/hostid as well.
> --> The server identifies itself to the client via the /etc/hostid and
>      a different identity means different server.
>      However, it could be a problem if the servers in different jails
>      return the "same server" from the same /etc/hostid.
>
> I'll admit it has been years since I did the "run nfsd in a vnet jail"
> code and I don't remember if the use of /etc/hostid is vnet'd or
> not.
>
> rick
>
> >
> > > Could nfs+nullfs give some trouble? Or maybe it is just nullfs that hangs everything and the nfs stuff is just a result of it.
> > >
> > > At the same moment I had git hanging on a non-NFS mount. See attachment for the procstat which also includes nfs-calls.
> >
> > ===
> > Mark Millard
> > marklmi at yahoo.com
> >
> ________________________________
>
>
>
> I only have 1 nfs server.
> My vnet jail that runs nfsd does not have /etc/hostid.
>
> /etc/rc.d/hostid and hostid_save have the keyword nojail so they never run inside a jail (AFAIK).
> # sysctl kern.hostid
> kern.hostid: 0
>
> Is that an issue?
Certainly could be. You'd need to capture packets during a mount and
look at them in wireshark (wireshark knows how to decode NFS).
Then you'd look at the reply to the ExchangeID and see what the
server calls itself. (I have no idea what it will have filled in, given the
above.)

If you are running nfsd in multiple jails and they reply with the
same server name string, that will definitely break NFSv4.
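
As a rough way to spot that situation before capturing packets, one could
compare the identity value seen inside each jail. The sketch below is only
an illustration: find_duplicate_values is a hypothetical helper name, and it
assumes kern.hostid (the sysctl shown above) is the value worth comparing.
Whether nfsd builds its server name from kern.hostid, kern.hostuuid or
something else is not confirmed in this thread.

```shell
# Hypothetical helper: reads "name value" pairs on stdin (one per line)
# and prints each value that is shared by more than one name, followed
# by the names that share it.
find_duplicate_values() {
    awk '{ count[$2]++; names[$2] = names[$2] " " $1 }
         END { for (id in count) if (count[id] > 1) print id ":" names[id] }'
}

# Possible usage on a FreeBSD host (not run here; assumes the jails
# are listed by jls and that sysctl works inside them):
#   for j in $(jls name); do
#       echo "$j $(jexec "$j" sysctl -n kern.hostid)"
#   done | find_duplicate_values
```

If this prints nothing, no two jails share the value. Any line it does print
names the jails whose ExchangeID replies would be worth checking first.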

rick

>
> Regards,
> Ronald.
>