Re: An interesting anomaly in NFS client...

From: George Neville-Neil <gnn_at_neville-neil.com>
Date: Fri, 08 Nov 2024 17:47:59 UTC

On 8 Nov 2024, at 7:58, Rick Macklem wrote:

> On Thu, Nov 7, 2024 at 9:41 PM George Neville-Neil <gnn@neville-neil.com> wrote:
>>
>>
>>
>> On 7 Nov 2024, at 13:59, Rick Macklem wrote:
>>
>>> On Thu, Nov 7, 2024 at 9:34 AM George Neville-Neil <gnn@neville-neil.com> wrote:
>>>>
>>>>
>>>>
>>>> On 7 Nov 2024, at 4:15, Mark Saad wrote:
>>>>
>>>>>>
>>>>>> On Nov 7, 2024, at 12:29 AM, Andriy Gapon <avg@freebsd.org> wrote:
>>>>>>
>>>>>> On 07/11/2024 02:43, George Neville-Neil wrote:
>>>>>>> Howdy,
>>>>>>> We've been digging into an interesting possible issue in the FreeBSD NFS client. Here is the scenario. I have a FreeBSD VM on my Mac, the Mac is the NFS server, the VM is the client.
>>>>>
>>>>> What are you using to run the vm ? What architecture is the vm ? What about the Mac ?
>>>>
>>>> qemu, aarch64, M3 Mac.
>>>>
>>>> I doubt this is the source of the issue.
>>>>
>>>> I was poking through the code and I wonder if a slight time skew might be an issue.  I'm going to check into that.  The VM and the Mac both use NTP to stay in sync with the world, but who knows...
>>> Hi George,
>>>
>>> I'll take a look at the packet trace later, but...
>>>
>>> If you can easily reproduce the issue, do a:
>>> # nfsstat -E -c -z
>>> - before reproducing it, and a
>>> # nfsstat -E -c
>>> - after. Then look at the Cache Info: at the end of the output.
>>>
>>
>> I'll give that a look, and the thing that Mark found is also interesting.  I might ask Warner about it tomorrow, we're both at the Dev Summit.
> When I looked at the packet trace, I saw a lot of GETATTRs
> for different directories. If they are different directories and not
> the same ones over and over again, caching will not be the issue.
> (Btw, the attribute caching code hasn't changed in decades, afaik.)
>

Looks like the answer is what Mark sent. I talked to Warner, and what we do now is, if not great, still the right thing; it just isn't so happy on NFS.  We use NFS in our kernel development work because we develop on VMs to start. Other than this pause, world builds on a modern (M3) laptop are as fast as on an average server (hurray SoCs), and when the thing crashes it reboots in seconds, rather than the 10 minutes a modern Dell server takes to do its hardware checks.

The shorter answer from some folks is "use 9pfs because NFS (server) on macOS is sloooow", which I'll look into as well.

Thanks for all the help, it's been an interesting journey ;-)

> Have fun at the dev summit, rick
>

Doing our best!

Best,
George