From nobody Tue Jul 20 15:43:17 2021
X-Original-To: bugs@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id B07241271274
 for ; Tue, 20 Jul 2021 15:43:17 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
 client-signature RSA-PSS (4096 bits) client-digest SHA256)
 (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 4GTjfY3QrRz4p09
 for ; Tue, 20 Jul 2021 15:43:17 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2610:1c1:1:606c::50:1d])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (Client did not present a certificate)
 by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 5D5F156A3
 for ; Tue, 20 Jul 2021 15:43:17 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org ([127.0.1.5])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id 16KFhHYK089165
 for ; Tue, 20 Jul 2021 15:43:17 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
Received: (from www@localhost)
 by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id 16KFhHWo089164
 for bugs@FreeBSD.org; Tue, 20 Jul 2021 15:43:17 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 257298] kernel panic with kern.cam.da.enable_uma_ccbs=1
Date: Tue, 20 Jul 2021 15:43:17 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base
 System
X-Bugzilla-Component: kern
X-Bugzilla-Version: CURRENT
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: pr@aoek.com
X-Bugzilla-Status: New
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform op_sys
 bug_status bug_severity priority component assigned_to reporter
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Sender: owner-freebsd-bugs@freebsd.org
MIME-Version: 1.0
X-ThisMailContainsUnwantedMimeParts: N

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257298

            Bug ID: 257298
           Summary: kernel panic with kern.cam.da.enable_uma_ccbs=1
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: pr@aoek.com

Hi,

despite https://reviews.freebsd.org/D31054 I can still reproduce exactly the
same backtrace as reported on the mailing list:
https://lists.freebsd.org/archives/freebsd-current/2021-June/000267.html

In particular, with a GENERIC kernel I get:

panic: Duplicate free of 0xffffa02039d7a000 from zone 0xffff000166aec000(ada_ccb) slab 0xffffa02039d7afd8(0)
cpuid = 10
time = 1626781044
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x188
panic() at panic+0x44
uma_dbg_free() at uma_dbg_free+0x1e4
uma_zfree_arg() at uma_zfree_arg+0x358
ahci_end_transaction() at ahci_end_transaction+0x7a4
ahci_ch_intr_main() at ahci_ch_intr_main+0x660
ahci_ch_intr() at ahci_ch_intr+0x5c
ahci_intr() at ahci_intr+0xe4
ithread_loop() at ithread_loop+0x2a8
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100137 ]
Stopped at kdb_enter+0x48: undefined f904411f

And with a GENERIC-NODEBUG kernel I get:

panic: vm_fault failed: ffff0000007a9b40 error 1
cpuid = 2
time = 1626773972
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x188
panic() at panic+0x44
data_abort() at data_abort+0x1e0
handle_el1h_sync() at handle_el1h_sync+0x74
--- exception, esr 0x96000044
zone_release() at zone_release+0x224
bucket_drain() at bucket_drain+0xe8
bucket_cache_reclaim_domain() at bucket_cache_reclaim_domain+0x3b0
zone_reclaim() at zone_reclaim+0x194
uma_reclaim_domain() at uma_reclaim_domain+0xbc
vm_pageout_worker() at vm_pageout_worker+0x594
vm_pageout() at vm_pageout+0x1e0
fork_exit() at fork_exit+0x94
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 33 tid 100222 ]
Stopped at kdb_enter+0x48: undefined f903c11f

This is with CURRENT as of

439097486ba0453e057c05d548fa306d91c784e5
Author: Jessica Clarke
Date: Mon Jul 19 17:19:23 2021 +0100

(This is just where my tree is now; it has nothing to do with Jessica's commit.)

Environment:

# uname -a
FreeBSD asn 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n248066-439097486ba0: Mon Jul 19 21:33:35 CEST 2021 root@asn:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG arm64

(or GENERIC instead of GENERIC-NODEBUG)

I have a board that is known to have low signal levels in the SATA subsystem
and frequently hits minor trouble with the ada disks, such as:

(ada0:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 c0 ce 36 40 06 00 00 00 00 00
(ada0:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada0:ahcich1:0:0:0): Retrying command, 3 more tries remain

or

ahcich1: Timeout on slot 14 port 0
ahcich1: is 00000000 cs 0003c080 ss 0003c080 rs 0003c080 tfd 50 serr 00180000 cmd 0000c017
(ada0:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 e8 8f ed 40 08 00 00 00 00 00
(ada0:ahcich1:0:0:0): CAM status: Command timeout
(ada0:ahcich1:0:0:0): Retrying command, 3 more tries remain

This is OK in itself: FreeBSD is solid enough to cope with these errors
without data loss at the filesystem level.

While I understand that this hardware is suboptimal, it exposes the bug that
is the subject of this report: the kernel panics when the hardware misbehaves.

Interestingly, I can avoid the bug by setting

kern.cam.da.enable_uma_ccbs=0

Note that I set the .da. sysctl, not the .ada. one.
I have no da disks in the system, only ada (two).

# sysctl -a | fgrep cbs
kern.cam.da.enable_uma_ccbs: 0
kern.cam.ada.enable_uma_ccbs: 0

I am unable to get a kernel dump with line numbers (RAM >> swap). Is there a
workaround for this?

Regarding the bug, I can test further; please suggest a direction.

-- 
You are receiving this mail because:
You are the assignee for the bug.