Date: Mon, 7 Mar 2022 16:42:54 -0500
From: Mark Johnston
To: Ronald Klop
Cc: bob prohaska, Mark Millard, freebsd-arm@freebsd.org, freebsd-current
Subject: Re: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Mon, Mar 07, 2022 at 09:54:26PM +0100, Ronald Klop wrote:
>
> From: Mark Johnston
> Date: Monday, 7 March 2022 16:13
> To: Ronald Klop
> CC: bob prohaska, Mark Millard, freebsd-arm@freebsd.org, freebsd-current
> > I haven't been able to reproduce any crashes running poudriere in an
> > arm64 AWS instance, though.  Could you please try the patch below and
> > confirm whether it fixes your panics?  I verified that the apparent
> > problem described above is gone with the patch.
> >
> > diff --git a/sys/kern/kern_rmlock.c b/sys/kern/kern_rmlock.c
> > index 0cdcfb8fec62..e51c25136ae0 100644
> > --- a/sys/kern/kern_rmlock.c
> > +++ b/sys/kern/kern_rmlock.c
> > @@ -437,6 +437,7 @@ _rm_rlock(struct rmlock *rm, struct rm_priotracker *tracker, int trylock)
> >  {
> >          struct thread *td = curthread;
> >          struct pcpu *pc;
> > +        int cpuid;
> >
> >          if (SCHEDULER_STOPPED())
> >                  return (1);
> > @@ -452,6 +453,7 @@ _rm_rlock(struct rmlock *rm, struct rm_priotracker *tracker, int trylock)
> >          atomic_interrupt_fence();
> >
> >          pc = get_pcpu();
> > +        cpuid = pc->pc_cpuid;
> >          rm_tracker_add(pc, tracker);
> >          sched_pin();
> >
> > @@ -463,7 +465,7 @@ _rm_rlock(struct rmlock *rm, struct rm_priotracker *tracker, int trylock)
> >           * conditional jump.
> >           */
> >          if (__predict_true(0 == (td->td_owepreempt |
> > -            CPU_ISSET(pc->pc_cpuid, &rm->rm_writecpus))))
> > +            CPU_ISSET(cpuid, &rm->rm_writecpus))))
> >                  return (1);
> >
> >          /* We do not have a read token and need to acquire one. */
> >
>
> Hi,
>
> This patch panicked again:
>   x0: ffffa00005a31500
>   x1: ffffa00005a0e000
>   x2:                2
>   x3: ffffa00076c4e9a0
>   x4:                0
>   x5:    e672743c8f9e5
>   x6:    dc89f70500ab1
>   x7:               14
>   x8: ffffa00005a31518
>   x9:                1
>  x10: ffffa00005a0e000
>  x11:                0
>  x12:                0
>  x13:                a
>  x14: 1013e6b85a8ecbe4
>  x15:     1dce740d11a5
>  x16: ffff3ea86e2434bf
>  x17: fffffffffffffff2
>  x18: ffff0000fe661800 (g_ctx + fcf9fa54)
>  x19: ffffa00076c4e9a0
>  x20: ffff0000fec39000 (g_ctx + fd577254)
>  x21:                2
>  x22: ffff0000419b6090 (g_ctx + 402f42e4)
>  x23: ffff000000c0b137 (lockstat_enabled + 0)
>  x24:              100
>  x25: ffff000000c0b000 (version + a0)
>  x26: ffff000000c0b000 (version + a0)
>  x27: ffff000000c0b000 (version + a0)
>  x28:                0
>  x29: ffff0000fe661800 (g_ctx + fcf9fa54)
>   sp: ffff0000fe661800
>   lr: ffff00000154ea50 (zio_dva_throttle + 154)
>  elr: ffff00000154ea80 (zio_dva_throttle + 184)
> spsr:         60000045
>  far:     2b753286b0b8
> panic: Unknown kernel exception 0 esr_el1 2000000
> cpuid = 1
> time = 1646685857
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x174
> panic() at panic+0x44
> do_el1h_sync() at do_el1h_sync+0x184
> handle_el1h_sync() at handle_el1h_sync+0x10
> --- exception, esr 0x2000000
> zio_dva_throttle() at zio_dva_throttle+0x184
> zio_execute() at zio_execute+0x58
> KDB: enter: panic
> [ thread pid 0 tid 100129 ]
> Stopped at      kdb_enter+0x44: undefined       f901c11f
> db>

ZFS doesn't make use of rm locks as far as I can see, so this is a
little weird.  I reverted the original rmlock commit in main, so it may
be worth verifying that the problem really is gone before digging
deeper.  In other words, I'm a bit suspicious that this is a different
bug.
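
For reference, here is a rough sketch of how the esr_el1 value in the
panic message breaks down.  It assumes the standard Arm ESR_ELx layout
(EC in bits 31:26, IL in bit 25, ISS in bits 24:0).  The helper names
are made up for illustration and are not the kernel's own macros.

```c
#include <stdint.h>

/*
 * Hypothetical helpers that split an ESR_EL1 value into its fields.
 * Assumed layout: EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
 */

/* Exception class: identifies why the exception was taken. */
static unsigned int
esr_ec(uint64_t esr)
{
	return ((unsigned int)((esr >> 26) & 0x3f));
}

/* Instruction length bit: 1 means a 32-bit instruction. */
static unsigned int
esr_il(uint64_t esr)
{
	return ((unsigned int)((esr >> 25) & 0x1));
}

/* Instruction-specific syndrome: extra detail whose meaning depends on EC. */
static uint32_t
esr_iss(uint64_t esr)
{
	return ((uint32_t)(esr & 0x1ffffff));
}
```

Applied to 0x2000000, these give EC 0 (the "unknown reason" class),
IL 1 and an ISS of 0, so the syndrome register carries no detail beyond
what the "Unknown kernel exception 0" text already says.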