From: Ronald Klop
To: FreeBSD Current
Date: Sun, 6 Mar 2022 23:22:42 +0100 (CET)
Subject: Re: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))
Message-ID: <710436463.96.1646605362794@mailrelay>
In-Reply-To: <989767310.86.1646492957581@mailrelay>
References: <1716388080.66.1646478964882@mailrelay> <989767310.86.1646492957581@mailrelay>
List-Archive: https://lists.freebsd.org/archives/freebsd-current

Hi,

I did a binary search over the kernels from artifact.ci.freebsd.org.

I suspect "rmlock: Micro-optimize read locking" is the cause:
https://cgit.freebsd.org/src/commit/?id=c84bb8cd771ce4bed58152e47a32dda470bef23a

and "rmlock: Add required compiler barriers to _rm_runlock()" is the fix:
https://cgit.freebsd.org/src/commit/?id=89ae8eb74e87ac19aa2d7abe4ba16bcccd32bb9f

So I probably just had a bad day.
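
For anyone repeating this kind of search, here is a minimal sketch of the bisection itself. It is not the procedure actually used above: the build labels and the panics() check are hypothetical stand-ins for booting a CI kernel and watching for the panic, and it assumes the oldest build is good, the newest is bad, and every build after the first bad one is also bad (so the range has to end before the fix commit).

```python
from typing import Callable, List, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds[0] is good, builds[-1] is bad, and every build
    after the first bad one is bad as well.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Hypothetical build labels, oldest first; these are not real artifact names.
builds = [f"build-{n:02d}" for n in range(1, 17)]
tested: List[str] = []


def panics(build: str) -> bool:
    """Stand-in for 'boot this kernel and see whether it panics'."""
    tested.append(build)
    return build >= "build-11"  # pretend the regression landed in build-11


culprit = first_bad(builds, panics)
print(culprit, "found after", len(tested), "test boots")
```

With 16 candidate builds the search needs only 4 test boots, which is what makes bisecting prebuilt kernels practical on slow hardware.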

Regards,
Ronald.

 

From: Ronald Klop <ronald-lists@klop.ws>
Date: Saturday, 5 March 2022 16:09
To: FreeBSD Current <freebsd-current@freebsd.org>
Subject: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))

Hi,

Another panic, this time while building world/kernel, with a different panic message and trace.
 
  x0:     1f5e152c32cc
  x1: ffff0000b630a000 (g_ctx + b4c4a254)
  x2:                1
  x3:               2e
  x4: ffffa000bb46d600
  x5:                0
  x6:                0
  x7: ffff000000f05104 (has_pan + 0)
  x8:                1
  x9:         809c227c
 x10:               bd
 x11:               40
 x12:                0
 x13:                1
 x14:         1782f000
 x15:             1001
 x16:         1782f003
 x17:     1f5e957392f0
 x18: ffff00010719e630 (next_index + 2cac528)
 x19: ffff00010719e768 (next_index + 2cac660)
 x20:                1
 x21: ffff0000b630a000 (g_ctx + b4c4a254)
 x22:                1
 x23:         ffffffbf
 x24: ffff00010719e758 (next_index + 2cac650)
 x25: ffffa00026cdd160
 x26:                1
 x27: ffffa000bb46d600
 x28: ffff00000092815a (do_execve.fexecv_proc_title + 5483)
 x29: ffff00010719e630 (next_index + 2cac528)
  sp: ffff00010719e630
  lr: ffff00000053e890 (uiomove_faultflag + 128)
 elr: ffff000000804f80 (byte_by_byte + 4)
spsr:               45
 far: ffff0000b630a000 (g_ctx + b4c4a254)
 esr:         96000047
panic: data abort in critical section or under mutex
cpuid = 2
time = 1646489189
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
data_abort() at data_abort+0x2a8
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0x96000047
byte_by_byte() at byte_by_byte+0x4
pipe_write() at pipe_write+0x668
KDB: enter: panic
[ thread pid 68336 tid 100593 ]
Stopped at      kdb_enter+0x44: undefined       f901c11f
db>
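
As a reading aid for the "esr: 96000047" line above, here is a small sketch that splits the value into its fields. The bit layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for data aborts WnR in ISS bit 6 with the fault status code in ISS bits 5:0) comes from the Armv8-A architecture, not from this thread; the function name and the two lookup tables are illustrative and deliberately partial.

```python
# ESR_ELx layout per the Armv8-A architecture:
#   EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
# For data aborts, ISS bit 6 is WnR (1 = write) and ISS bits [5:0]
# hold the data fault status code (DFSC).
EC_NAMES = {0x25: "data abort taken without a change in exception level"}
DFSC_NAMES = {0x07: "translation fault, level 3"}  # partial table only


def decode_data_abort(esr: int) -> dict:
    """Split an ESR value and name the fields relevant to a data abort."""
    ec = (esr >> 26) & 0x3F
    iss = esr & 0x1FFFFFF
    dfsc = iss & 0x3F
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this partial table"),
        "il": (esr >> 25) & 1,
        "write": bool((iss >> 6) & 1),
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, "not in this partial table"),
    }


info = decode_data_abort(0x96000047)
for key, value in info.items():
    print(f"{key}: {value}")
```

Read this way, the value describes a write that took a level 3 translation fault while already running in the kernel, and the faulting address in "far" is the same value held in x1 and x21.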


Regards,
Ronald.


 

From: Ronald Klop <ronald-lists@klop.ws>
Date: Saturday, 5 March 2022 12:16
To: FreeBSD Current <freebsd-current@freebsd.org>
Subject: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28)

Hi,

I am seeing repeated panics on 14-CURRENT/aarch64. They happen, for example, when the nightly backup starts.
# uname -a
FreeBSD rpi4 14.0-CURRENT FreeBSD 14.0-CURRENT #22 main-5f702d6d9a: Mon Feb 28 06:12:48 CET 2022     ronald@rpi4:/home/ronald/dev/obj/home/ronald/dev/freebsd/arm64.aarch64/sys/GENERIC-NODEBUG arm64

It was stable with all kernels up to and including "FreeBSD 14.0-CURRENT #21 main-e11ad014d1-dirty: Sat Feb  5 00:09:08 CET 2022".

It runs ZFS-on-root from a USB disk. No other filesystem is involved.
# gpart show
=>        40  1953458096  da0  GPT  (931G)
          40      102400    1  efi  (50M)
      102440     8388608    2  freebsd-swap  (4.0G)
     8491048  1944967088    3  freebsd-zfs  (927G)


Output on serial console:
  x0: ffffa000059c1380
  x1: ffffa000059b1600
  x2:                3
  x3: ffffa001862779a0
  x4:                0
  x5:    9438238792a1a
  x6:    d217e9df58308
  x7:               14
  x8: ffffa000059c1398
  x9:                1
 x10: ffffa000059b1600
 x11:                2
 x12:                1
 x13: f2557a42c5b0f240
 x14: 1013e6b85a8ecbe4
 x15:     24f981889f30
 x16: ffff4afedeb89cb8
 x17: fffffffffffffff2
 x18: ffff0000fe666800 (g_ctx + fcfa6a54)
 x19:                0
 x20: ffff0000fec41000 (g_ctx + fd581254)
 x21:                3
 x22: ffff0000419bb090 (g_ctx + 402fb2e4)
 x23: ffff000000c09bb7 (lockstat_enabled + 0)
 x24:              180
 x25: ffff000000c09000 (sdt_vfs_vop_vop_spare1_entry + 28)
 x26: ffff000000c09000 (sdt_vfs_vop_vop_spare1_entry + 28)
 x27: ffff000000c09000 (sdt_vfs_vop_vop_spare1_entry + 28)
 x28:                0
 x29: ffff0000fe666800 (g_ctx + fcfa6a54)
  sp: ffff0000fe666800
  lr: ffff00000154ca38 (zio_dva_throttle + 13c)
 elr: ffff00000154ca80 (zio_dva_throttle + 184)
spsr:         20000045
 far:     1f24979f8000
panic: Unknown kernel exception 0 esr_el1 2000000
cpuid = 2
time = 1646433952
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
do_el1h_sync() at do_el1h_sync+0x184
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0x2000000
zio_dva_throttle() at zio_dva_throttle+0x184
zio_execute() at zio_execute+0x58
KDB: enter: panic
[ thread pid 0 tid 100128 ]
Stopped at      kdb_enter+0x44: undefined       f901c11f
db>
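
The "exception 0 esr_el1 2000000" in the panic message can be unpacked in the same spirit. The sketch below only separates the three top-level ESR fields; the layout is taken from the Armv8-A architecture, and split_esr is an illustrative name, not a FreeBSD function.

```python
from typing import Tuple


def split_esr(esr: int) -> Tuple[int, int, int]:
    """Return (EC, IL, ISS) using the Armv8-A ESR_ELx layout:
    EC = bits [31:26], IL = bit 25, ISS = bits [24:0]."""
    return (esr >> 26) & 0x3F, (esr >> 25) & 1, esr & 0x1FFFFFF


ec, il, iss = split_esr(0x2000000)
print(f"EC={ec:#x} IL={il} ISS={iss:#x}")
```

Only the IL bit is set here. The exception class is 0, which the architecture defines as "unknown reason", so the register carries no further detail about the fault; that matches the "Unknown kernel exception 0" wording of the panic.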

I'm going to build a newer kernel to see if the problem persists. I can keep the current kernel to reproduce this if needed.

Regards,
Ronald.