[Bug 266014] panic: on long running find / plus

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 24 Aug 2022 08:42:54 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266014

            Bug ID: 266014
           Summary: panic: on long running find /    plus
           Product: Base System
           Version: 13.1-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: dpy@pobox.com

panic: VERIFY3(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED,
&zp->z_sa_hdl)) failed (0 == 5)   
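
A note on reading the panic string, assuming sa_handle_get_from_db() returns
an errno-style code: VERIFY3 prints the two compared values in the trailing
parentheses, so (0 == 5) means 0 was expected and 5 came back, and errno 5 is
EIO (input/output error).  The short Python sketch below does that decoding.
The names PANIC and decode_verify3 are made up for this illustration, and the
lookup uses the errno table of whatever host runs it, which agrees with
FreeBSD for the value 5.

```python
import errno
import os
import re

# Panic string copied from this report (joined onto one logical line).
PANIC = ("VERIFY3(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, "
         "SA_HDL_SHARED, &zp->z_sa_hdl)) failed (0 == 5)")


def decode_verify3(panic):
    """Return (expected, actual, errno_name, errno_text) for a failed
    VERIFY3 equality check.  Assumes the actual value is an errno."""
    m = re.search(r"failed \((-?\d+) == (-?\d+)\)\s*$", panic)
    if m is None:
        raise ValueError("not a VERIFY3 equality failure")
    expected, actual = int(m.group(1)), int(m.group(2))
    name = errno.errorcode.get(actual, "unknown")
    return expected, actual, name, os.strerror(actual)


print(decode_verify3(PANIC))
```

If that assumption holds, the lookup in zfs_znode_alloc() hit an I/O error
while reading the object's system attributes, rather than a logic error in
find itself.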


My machine has suddenly started to panic and crash consistently during its
overnight run.  The panic can be triggered by a find / (with various
parameters), and also by the overnight security run etc.

I can now trigger a crash at will (after 10 minutes or so).  I initially had
trouble getting a crash dump; I now have one, but am having problems with kgdb:

-------------

triple0# kgdb /boot/kernel/kernel /var/crash/vmcore.6

GNU gdb (GDB) 12.1 [GDB v12.1 for FreeBSD]
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.1".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
/usr/ports/devel/gdb/work-py39/gdb-12.1/gdb/thread.c:1328: internal-error:
switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x12c6931 ???
0x16d2678 ???
0x16d24f8 ???
0x1b21d63 ???
0x16971b8 ???
0x13e32ee ???
0x167841f ???
0x12fb704 ???
0x16a3392 ???
0x1498ac5 ???
0x149748b ???
0x1495e8e ???
0x11f317b ???
---------------------
/usr/ports/devel/gdb/work-py39/gdb-12.1/gdb/thread.c:1328: internal-error:
switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

----------------------------

This is the standard kernel (i.e. freebsd-update ...):

triple0# uname -a
FreeBSD triple0.internal 13.1-RELEASE-p1 FreeBSD 13.1-RELEASE-p1 GENERIC amd64

I installed kgdb from pkg, then from ports; the problem is the same either way.

================================================

Kernel panic details:

Since I haven't been able to get the debugger to work, I have instead taken a
photo of the stack backtrace on screen.

Transcribed by hand, it is basically:

#0 kdb_backtrace+0x65
#1 vpanic+0x17f
#2 spl_panic+0x3a
#3 zfs_znode_alloc+0x522
#4 zfs_zget+0x3b5
#5 zfs_dirent_lookup+0x16b
#6 zfs_dirlook+0x7a
#7 zfs_lookup+0x3d0
#8 zfs_freebsd_cachedlookup+0x3d0
#9 vfs_cache_lookup+0xad
#10 VOP_LOOKUP+0x30
#11 cache_fplookup_noentry+0x1a3
#12 cache_fplookup+0x366
#13 namei+0x12a
#14 kern_statat+0xf3
#15 sys_fstatat+0x2f
#16 amd64_syscall+0x10c
#17 fast_syscall_common+0xf8

I have attached the photo.

I hope someone can point me in the right direction (kgdb etc.).

Thanks
Duncan

------
------

cat /var/crash/info.6
Dump header from device: /dev/da15p1
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 4367545370
  Blocksize: 512
  Compression: zstd
  Dumptime: 2022-08-24 14:33:16 +1000
  Hostname: triple0.internal
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 13.1-RELEASE-p1 GENERIC
  Panic String: VERIFY3(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp,
SA_HDL_SHARED, &zp->z_sa_hdl)) failed (0 == 5)

  Dump Parity: 1622049469
  Bounds: 6
  Dump Status: good
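
To put the Dump Length above into more familiar units: savecore reports it as
a byte count.  A minimal Python sketch of the conversion follows; INFO and
parse_info are illustrative names, and only fields quoted from info.6 above
are used.

```python
# A few fields copied from /var/crash/info.6 as shown in this report.
INFO = """\
  Dump Length: 4367545370
  Blocksize: 512
  Compression: zstd
"""


def parse_info(text):
    """Split 'Key: value' lines into a dict, ignoring lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


fields = parse_info(INFO)
dump_bytes = int(fields["Dump Length"])
dump_gib = dump_bytes / 2**30  # bytes -> GiB
print(f"{dump_bytes} bytes = {dump_gib:.2f} GiB ({fields['Compression']})")
# prints: 4367545370 bytes = 4.07 GiB (zstd)
```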


---------
---------

cat /var/crash/core.txt.6
r/crash/vmcore.6.zst : Read error (39) : premature end
triple0.internal dumped core - see /var/crash/vmcore.6

Wed Aug 24 14:52:35 AEST 2022

FreeBSD triple0.internal 13.1-RELEASE-p1 FreeBSD 13.1-RELEASE-p1 GENERIC  amd64

panic: VERIFY3(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED,
&zp->z_sa_hdl)) failed (0 == 5)

GNU gdb (GDB) 12.1 [GDB v12.1 for FreeBSD]
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
/wrkdirs/usr/ports/devel/gdb/work-py39/gdb-12.1/gdb/thread.c:1328:
internal-error: switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) [answered Y; input not from terminal]

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/wrkdirs/usr/ports/devel/gdb/work-py39/gdb-12.1/gdb/thread.c:1328:
internal-error: switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) [answered Y; input not from terminal]
Abort trap (core dumped)


------------------------------------------------------------------------
ps -axlww

ps: can't read nprocs

------------------------------------------------------------------------
vmstat -s

vmstat: vm_cnt:

------------------------------------------------------------------------
vmstat -m

vmstat: memstat_kvm_malloc: _kvm_vnet_selectpid: dumptid
         Type InUse MemUse Requests  Size(s)

------------------------------------------------------------------------
vmstat -z

vmstat: memstat_kvm_uma: KVM short read
ITEM                   SIZE  LIMIT     USED     FREE      REQ     FAILSLEEP XDOMAIN

------------------------------------------------------------------------
vmstat -i

vmstat: sintrnames:

------------------------------------------------------------------------
pstat -T

pstat: kvm_read():

------------------------------------------------------------------------
pstat -s

Device          1K-blocks     Used    Avail Capacity

------------------------------------------------------------------------
iostat

iostat: devstat_checkversion: userland devstat version 6 is not the same as the kernel
devstat_checkversion: devstat version 0
devstat_checkversion: libdevstat newer than kernel


------------------------------------------------------------------------
ipcs -a

ipcs: msginfo:

------------------------------------------------------------------------
ipcs -T

ipcs: msginfo:

------------------------------------------------------------------------
netstat -s


------------------------------------------------------------------------
netstat -m

netstat: memstat_kvm_all: _kvm_vnet_selectpid: dumptid

------------------------------------------------------------------------
netstat -anA

netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid

------------------------------------------------------------------------
netstat -aL

netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid
netstat: _kvm_vnet_selectpid: dumptid

------------------------------------------------------------------------
fstat

fstat: procstat_getprocs()

------------------------------------------------------------------------
dmesg

dmesg: kvm_read:

------------------------------------------------------------------------
kernel config

options CONFIG_AUTOGENERATED
ident   GENERIC
machine amd64
cpu     HAMMER
makeoptions     WITH_CTF=1
makeoptions     DEBUG=-g
options IICHID_SAMPLING
options HID_DEBUG
options EVDEV_SUPPORT
options XENHVM
options USB_DEBUG
options ATH_ENABLE_11N
options AH_AR5416_INTERRUPT_MITIGATION
options IEEE80211_SUPPORT_MESH
options IEEE80211_DEBUG
options SC_PIXEL_MODE
options VESA
options PPS_SYNC
options COMPAT_LINUXKPI
options PCI_IOV
options PCI_HP
options IOMMU
options EARLY_AP_STARTUP
options SMP
options NETGDB
options NETDUMP
options DEBUGNET
options ZSTDIO
options GZIO
options EKCD
options KDB_TRACE
options KDB
options RCTL
options RACCT_DEFAULT_TO_DISABLED
options RACCT
options INCLUDE_CONFIG_FILE
options DDB_CTF
options KDTRACE_HOOKS
options KDTRACE_FRAME
options MAC
options CAPABILITIES
options CAPABILITY_MODE
options AUDIT
options HWPMC_HOOKS
options KBD_INSTALL_CDEV
options PRINTF_BUFR_SIZE=128
options _KPOSIX_PRIORITY_SCHEDULING
options SYSVSEM
options SYSVMSG
options SYSVSHM
options STACK
options KTRACE
options SCSI_DELAY=5000
options COMPAT_FREEBSD12
options COMPAT_FREEBSD11
options COMPAT_FREEBSD10
options COMPAT_FREEBSD9
options COMPAT_FREEBSD7
options COMPAT_FREEBSD6
options COMPAT_FREEBSD5
options COMPAT_FREEBSD4
options COMPAT_FREEBSD32
options EFIRT
options GEOM_LABEL
options GEOM_RAID
options TMPFS
options PSEUDOFS
options PROCFS
options CD9660
options MSDOSFS
options NFS_ROOT
options NFSLOCKD
options NFSD
options NFSCL
options MD_ROOT
options QUOTA
options UFS_GJOURNAL
options UFS_DIRHASH
options UFS_ACL
options SOFTUPDATES
options FFS
options KERN_TLS
options SCTP_SUPPORT
options TCP_RFC7413
options TCP_HHOOK
options TCP_BLACKBOX
options TCP_OFFLOAD
options ROUTE_MPATH
options IPSEC_SUPPORT
options INET6
options INET
options VIMAGE
options PREEMPTION
options NUMA
options SCHED_ULE
options NEW_PCIB
options GEOM_PART_GPT
options GEOM_PART_MBR
options GEOM_PART_EBR
options GEOM_PART_BSD
device  isa
device  mem
device  io
device  uart_ns8250
device  cpufreq
device  acpi
device  smbios
device  pci
device  fdc
device  ahci
device  ata
device  mvs
device  siis
device  ahc
device  ahd
device  esp
device  hptiop
device  isp
device  mpt
device  mps
device  mpr
device  sym
device  isci
device  ocs_fc
device  pvscsi
device  scbus
device  ch
device  da
device  sa
device  cd
device  pass
device  ses
device  amr
device  arcmsr
device  ciss
device  iir
device  ips
device  mly
device  twa
device  smartpqi
device  tws
device  aac
device  aacp
device  aacraid
device  ida
device  mfi
device  mlx
device  mrsas
device  pmspcv
device  twe
device  nvme
device  nvd
device  vmd
device  atkbdc
device  atkbd
device  psm
device  kbdmux
device  vga
device  splash
device  sc
device  vt
device  vt_vga
device  vt_efifb
device  vt_vbefb
device  agp
device  cbb
device  pccard
device  cardbus
device  uart
device  ppc
device  ppbus
device  lpt
device  ppi
device  puc
device  iflib
device  em
device  igc
device  ix
device  ixv
device  ixl
device  iavf
device  ice
device  vmx
device  axp
device  bxe
device  le
device  ti
device  mlx5
device  mlxfw
device  mlx5en
device  miibus
device  ae
device  age
device  alc
device  ale
device  bce
device  bfe
device  bge
device  cas
device  dc
device  et
device  fxp
device  gem
device  jme
device  lge
device  msk
device  nfe
device  nge
device  re
device  rl
device  sge
device  sis
device  sk
device  ste
device  stge
device  vge
device  vr
device  xl
device  wlan
device  wlan_wep
device  wlan_ccmp
device  wlan_tkip
device  wlan_amrr
device  an
device  ath
device  ath_pci
device  ath_hal
device  ath_rate_sample
device  ipw
device  iwi
device  iwn
device  malo
device  mwl
device  ral
device  wpi
device  crypto
device  aesni
device  loop
device  padlock_rng
device  rdrand_rng
device  ether
device  vlan
device  tuntap
device  md
device  gif
device  firmware
device  xz
device  bpf
device  uhci
device  ohci
device  ehci
device  xhci
device  usb
device  ukbd
device  umass
device  sound
device  snd_cmi
device  snd_csa
device  snd_emu10kx
device  snd_es137x
device  snd_hda
device  snd_ich
device  snd_via8233
device  mmc
device  mmcsd
device  sdhci
device  rtsx
device  virtio
device  virtio_pci
device  vtnet
device  virtio_blk
device  virtio_scsi
device  virtio_balloon
device  kvm_clock
device  hyperv
device  xenpci
device  netmap
device  evdev
device  uinput
device  hid

------------------------------------------------------------------------
ddb capture buffer

ddb: ddb_capture: kvm_nlist

-----
-----

128 GB memory, AMD Ryzen Threadripper 1920X 12-core
6 running virtual machines (bhyve)
8 jails
quite a few VLANs/bridges etc.

triple0# zpool list -v
NAME                       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
data_pool                 54.6T  32.7T  21.8T        -         -    13%    59%  1.00x    ONLINE  -
  raidz2-0                54.6T  32.7T  21.8T        -         -    13%  60.0%      -    ONLINE
    gpt/SG601                 -      -      -        -         -      -      -      -    ONLINE
    gpt/SG602                 -      -      -        -         -      -      -      -    ONLINE
    gpt/SG603                 -      -      -        -         -      -      -      -    ONLINE
    gpt/SG609                 -      -      -        -         -      -      -      -    ONLINE
    gpt/SG605                 -      -      -        -         -      -      -      -    ONLINE
    gpt/SG606                 -      -      -        -         -      -      -      -    ONLINE
    gpt/SG607                 -      -      -        -         -      -      -      -    ONLINE
    gpt/SG608                 -      -      -        -         -      -      -      -    ONLINE
    gpt/T6_7                  -      -      -        -         -      -      -      -    ONLINE
    gpt/T6_10                 -      -      -        -         -      -      -      -    ONLINE
logs                          -      -      -        -         -      -      -      -  -
  gpt/zil01               31.5G  76.1M  31.4G        -         -     0%  0.23%      -    ONLINE
cache                         -      -      -        -         -      -      -      -  -
  gpt/data_cache           900G   328G   572G        -         -     0%  36.4%      -    ONLINE
  gpt/data_cache2          466G   464G  1.69G        -         -     0%  99.6%      -    ONLINE
pool_18a                  81.8T  67.6T  14.2T        -         -     1%    82%  1.00x    ONLINE  -
  raidz1-0                81.8T  67.6T  14.2T        -         -     1%  82.6%      -    ONLINE
    gpt/T18_1                 -      -      -        -         -      -      -      -    ONLINE
    gpt/T18_3                 -      -      -        -         -      -      -      -    ONLINE
    gpt/T18_2                 -      -      -        -         -      -      -      -    ONLINE
    gpt/T18_4                 -      -      -        -         -      -      -      -    ONLINE
    gpt/T18_0                 -      -      -        -         -      -      -      -    ONLINE
cache                         -      -      -        -         -      -      -      -  -
  gpt/ssd120_R3C2          112G  3.39G   108G        -         -     0%  3.03%      -    ONLINE
system                     432G   194G   238G        -         -    47%    44%  1.00x    ONLINE  -
  mirror-0                 432G   194G   238G        -         -    47%  44.9%      -    ONLINE
    gpt/system_A              -      -      -        -         -      -      -      -    ONLINE
    gpt/system_pool_R1C1      -      -      -        -         -      -      -      -    ONLINE
