releng/13 release/13.0.0 : odd/incorrect diff result over nfs (in a zfs file systems context)

Sun May 23 19:20:57 UTC 2021

On 2021-May-23, at 01:27, Mark Millard <marklmi at yahoo.com> wrote:

> On 2021-May-23, at 00:44, Mark Millard <marklmi at yahoo.com> wrote:
> 
>> On 2021-May-21, at 17:56, Rick Macklem <rmacklem at uoguelph.ca> wrote:
>> 
>>> Mark Millard wrote:
>>> [stuff snipped]
>>>> Well, why is it that ls -R, find, and diff -r all get file
>>>> name problems via genet0 but diff -r gets no problems
>>>> comparing the content of files that it does match up (the
>>>> vast majority)? Any clue how could the problems possibly
>>>> be unique to the handling of file names/paths? Does it
>>>> suggest anything else to look into for getting some more
>>>> potentially useful evidence?
>>> Well, all I can do is describe the most common TSO related
>>> failure:
>>> - When a read RPC reply (including NFS/RPC/TCP/IP headers)
>>> is slightly less than 64K bytes (many TSO implementations are
>>> limited to 64K or 32 discontiguous segments, think 32 2K
>>> mbuf clusters), the driver decides it is ok, but when the MAC
>>> header is added it exceeds what the hardware can handle correctly...
>>> --> This will happen when reading a regular file that is slightly less
>>>     than a multiple of 64K in size.
>>> or
>>> --> This will happen when reading just about any large directory,
>>>    since the directory reply for a 64K request is converted to Sun XDR
>>>    format and clipped at the last full directory entry that will fit within 64K.
>>> For ports, where most files are small, I think you can tell which is more
>>> likely to happen.
>>> --> If TSO is disabled, I have no idea how this might matter, but??
>>> 
>>>> I'll note that netstat -I ue0 -d and netstat -I genet0 -d
>>>> do not report changes in Ierrs or Idrop in a before vs.
>>>> after failures comparison. (There may be better figures
>>>> to look at for all I know.)
>>>> 
>>>> I tried "ifconfig genet0 -rxcsum -rxcsum -rxcsum6 -txcsum6"
>>>> and got no obvious change in behavior.
>>> All we know is that the data is getting corrupted somehow.
>>> 
>>> NFS traffic looks very different than typical TCP traffic. It is
>>> mostly small messages travelling in both directions concurrently,
>>> with some large messages thrown in the mix.
>>> All I'm saying is that, testing a net interface with something like
>>> bulk data transfer in one direction doesn't verify it works for NFS
>>> traffic.
>>> 
>>> Also, the large RPC messages are a chain of about 33 mbufs of
>>> various lengths, including a mix of partial clusters and regular
>>> data mbufs, whereas a bulk send on a socket will typically
>>> result in an mbuf chain of a lot of full 2K clusters.
>>> --> As such, NFS can be good at tickling subtle bugs it the
>>>    net driver related to mbuf handling.
>>> 
>>> rick
>>> 
>>>>> W.r.t. reverting r367492...the patch to replace r367492 was just
>>>>> committed to "main" by rscheff@ with a two week MFC, so it
>>>>> should be in stable/13 soon. Not sure if an errata can be done
>>>>> for it for releng13.0?
>>>> 
>>>> That update is reported to be causing "rack" related panics:
>>>> 
>>>> https://lists.freebsd.org/pipermail/dev-commits-src-main/2021-May/004440.html
>>>> 
>>>> reports (via links):
>>>> 
>>>> panic: _mtx_lock_sleep: recursed on non-recursive mutex so_snd @ /syzkaller/managers/i386/kernel/sys/modules/tcp/rack/../../../netinet/tcp_stacks/rack.c:10632
>>>> 
>>>> Still, I have a non-debug update to main building and will
>>>> likely do a debug build as well. llvm is rebuilding, so
>>>> the builds will take a notable time.
>> 
>> I got the following built and installed on the two
>> machines:
>> 
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n246854-03b0505b8fe8-dirty: Sat May 22 16:25:04 PDT 2021     root at CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-dbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-DBG-CA72  arm64 aarch64 1400013 1400013
>> 
>> # uname -apKU
>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n246854-03b0505b8fe8-dirty: Sat May 22 16:25:04 PDT 2021     root at CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-dbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-DBG-CA72  arm64 aarch64 1400013 1400013
>> 
>> Note that both are booted with debug builds of main.
>> 
>> Using the context with the alternate EtherNet device that has not
>> had an associated diff -r, find, pr ls -R failure yet
>> yet got a panic that looks likely to be unrelated:
>> 
>> # mount -onoatime 192.168.1.187:/usr/ports/ /mnt/
>> # diff -r /usr/ports/ /mnt/ | more
>> nvme0: cpl does not map to outstanding cmd
>> cdw0:00000000 sqhd:0020 sqid:0003 cid:007e p:1 sc:00 sct:0 m:0 dnr:0
>> panic: received completion for unknown cmd
>> cpuid = 3
>> time = 1621743752
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x188
>> panic() at panic+0x44
>> nvme_qpair_process_completions() at nvme_qpair_process_completions+0x1fc
>> nvme_timeout() at nvme_timeout+0x3c
>> softclock_call_cc() at softclock_call_cc+0x124
>> softclock() at softclock+0x60
>> ithread_loop() at ithread_loop+0x2a8
>> fork_exit() at fork_exit+0x74
>> fork_trampoline() at fork_trampoline+0x14
>> KDB: enter: panic
>> [ thread pid 12 tid 100028 ]
>> Stopped at      kdb_enter+0x48: undefined       f904411f
>> db> 
>> 
>> Based on the "nvme" references, I expect this is tied to
>> handling the Optane 480 GiByte that is in the PCIe slot
>> and is the boot/only media for the machine doing the diff.
>> 
>> "db> dump" seems to have worked.
>> 
>> After reboot, zpool scrub found no errors.
>> 
>> So, trying again . . .
>> 
>> I got some "Expensive timeout(9) function" notices:
>> 
>> Expensive timeout(9) function: 0xffff000000717b64(0) 1.210285924 s
>> Expensive timeout(9) function: 0xffff000000717b64(0) 4.001010935 s
>> 
>> 0xffff000000717b64 looks to be uma_timeout:
>> 
>> ffff000000717b60 <uma_startup3+0x118> b ffff000000717b3c <uma_startup3+0xf4>
>> ffff000000717b64 <uma_timeout> stp      x29, x30, [sp, #-32]!
>> ffff000000717b68 <uma_timeout+0x4> stp  x20, x19, [sp, #16]
>> . . .
>> 
>> . . . Hmm. The debug kernel test context seems to take a
>> very long time. It has not failed so far but is still
>> going.
>> 
>> So I stopped it and switch to testing with the genet0 device
>> that was involved for the earlier failures.  . . .
>> 
>> It did not fail. Nor did the debug kernel report anything
>> beyond:
>> 
>> if_delmulti_locked: detaching ifnet instance 0xffffa00000fc8000
>> if_delmulti_locked: detaching ifnet instance 0xffffa00000fc8000
>> Expensive timeout(9) function: 0xffff00000050c088(0) 6.318652023 s
>> 
>> on one machine and:
>> 
>> if_delmulti_locked: detaching ifnet instance 0xffffa0000b56b800
>> 
>> on the other.
>> 
>> So I may reboot into the also-updated non-debug builds on both
>> machines and try in that context.
>> 
> 
> The non-debug build pair of machines got the problem:
> 
> # diff -r /usr/ports/ /mnt/ | more
> Only in /mnt/devel/electron12/files: 
> Only in /usr/ports/devel/electron12/files: patch-chrome_browser_media_webrtc_webrtc__logging__controller.cc
> Only in /usr/ports/devel/electron12/files: patch-components_previews_core_previews__features.cc
> Only in /mnt/devel/electron12/files: <A0><CE><C8>֏<DC>ܦ2<B2><E2><AA>^H
> Only in /mnt/www/chromium/files: patch-chrome_browser_chrome__browser
> Only in /usr/ports/www/chromium/files: patch-chrome_browser_chrome__browser__main__posix.cc
> 
> I'll note that it turns out that the debug build had more
> than is typical enabled: DIAGNOSTICS, BUF_TRACKING, and
> FULL_BUF_TRACKING were also enabled. I'd forgotten that
> I'd previously had a reason to add those to what my debug
> builds included (for a prior problem investigation). I'd
> not done debug builds in some time.

Without DIAGNOSTIC, BUF_TRACKING, and FULL_BUF_TRACKING
(so based on a more normal debug build on both sides),
the diff -r progressed at a more normal, sustained rate.

Yet . . .

# mount -onoatime 192.168.1.170:/usr/ports/ /mnt/
# diff -r /usr/ports/ /mnt/ | more
# 

In other words: no failure from the debug build.
Also no reports of anything by the debug kernel.
Multiple attempts (including some with reboots
between): same results.

So, it appears that only non-debug builds are
broken, for whatever reason.

For reference:

# uname -apKU
FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #2 main-n246854-03b0505b8fe8-dirty: Sun May 23 05:57:01 PDT 2021     root at CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-dbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-DBG-CA72  arm64 aarch64 1400013 1400013

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)