From bugzilla-noreply@freebsd.org Mon Jun 13 20:20:06 2022
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 253954] kernel: g_access(958): provider da8 has error 6 set
Date: Mon, 13 Jun 2022 20:20:06 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 12.1-STABLE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: jnaughto@ee.ryerson.ca
X-Bugzilla-Status: New
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
Sender: owner-freebsd-bugs@freebsd.org
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253954

jnaughto@ee.ryerson.ca changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jnaughto@ee.ryerson.ca

--- Comment #4 from jnaughto@ee.ryerson.ca ---
Any update on this bug? I just experienced the exact same issue. I have 8
disks (all SATA) connected to a FreeBSD 12.3 system. The ZFS pool is set up
as a raidz3. Got in today and found one drive was "REMOVED":

# zpool status pool
  pool: pool
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 0 days 02:32:26 with 0 errors on Sat Jun 11
        05:32:26 2022
config:

        NAME                     STATE     READ WRITE CKSUM
        pool                     DEGRADED     0     0     0
          raidz3-0               DEGRADED     0     0     0
            ada0                 ONLINE       0     0     0
            ada1                 ONLINE       0     0     0
            ada2                 ONLINE       0     0     0
            ada3                 ONLINE       0     0     0
            ada4                 ONLINE       0     0     0
            8936423309855741075  REMOVED      0     0     0  was /dev/ada5
            ada6                 ONLINE       0     0     0
            ada7                 ONLINE       0     0     0

I assumed that the drive had died and pulled it. I put a new drive in place
and attempted to replace it:

# zpool replace pool 8936423309855741075 ada5
cannot replace 8936423309855741075 with ada5: no such pool or dataset

It seems that the old drive is somehow still remembered by the system. I dug
through the logs and found the following occurring when the new drive is
inserted into the system:

Jun 13 13:03:15 server kernel: cam_periph_alloc: attempt to re-allocate
valid device ada5 rejected flags 0x118 refcount 1
Jun 13 13:03:15 server kernel: adaasync: Unable to attach to new device due
to status 0x6
Jun 13 13:04:23 server kernel: g_access(961): provider ada5 has error 6 set

Did a reboot without the new drive in place.
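For anyone hitting the same state, a rough command sequence that may avoid the reboot (a sketch only, assuming the pool name and vdev GUID from the output above; verify device names on your own system before running anything):

```shell
# The cam_periph_alloc "re-allocate valid device" message suggests the old
# ada5 peripheral was never fully detached. Ask CAM to rescan the buses so
# the stale instance can be dropped and the new disk attached:
camcontrol rescan all

# Confirm the new disk actually attached and what name it got:
camcontrol devlist

# Replace the missing vdev by its GUID rather than the old device name;
# ZFS tracks a removed disk by GUID, not by /dev path:
zpool replace pool 8936423309855741075 /dev/ada5

# Then watch the resilver:
zpool status -v pool
```

Whether the rescan clears the `g_access ... error 6` condition depends on whether GEOM has released the old provider; if it has not, a reboot (as described below) may still be the only way out.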
On reboot the output of the pool did look somewhat different:

# zpool status pool
  pool: pool
 state: DEGRADED
status: One or more devices could not be used because the label is missing
        or invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 0 days 02:32:26 with 0 errors on Sat Jun 11
        05:32:26 2022
config:

        NAME                      STATE     READ WRITE CKSUM
        pool                      DEGRADED     0     0     0
          raidz3-0                DEGRADED     0     0     0
            ada0                  ONLINE       0     0     0
            ada1                  ONLINE       0     0     0
            ada2                  ONLINE       0     0     0
            ada3                  ONLINE       0     0     0
            ada4                  ONLINE       0     0     0
            8936423309855741075   FAULTED      0     0     0  was /dev/ada5
            ada5                  ONLINE       0     0     0
            diskid/DISK-Z1W4HPXX  ONLINE       0     0     0

errors: No known data errors

I assumed this was because there was one less drive attached, and the system
had assigned new adaX values to each drive. At this point, when I inserted
the new drive, it appeared as ada9, so I re-issued the zpool replace command
with ada9. It took about 3 minutes before the zpool replace command
responded, which really concerned me. Yet the server has quite a few users
accessing the filesystem, so I thought that as long as the new drive was
resilvering I would be fine...

I do a weekly scrub of the pool, and I believe the error crept up after the
scrub. At 11am today the logs showed the following:

Jun 13 11:29:15 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): FLUSHCACHE48.
ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jun 13 11:29:15 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): CAM status:
Command timeout
Jun 13 11:29:15 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): Retrying
command, 0 more tries remain
Jun 13 11:30:35 172.16.20.66 kernel: ahcich5: Timeout on slot 5 port 0
Jun 13 11:30:35 172.16.20.66 kernel: ahcich5: is 00000000 cs 00000060 ss
00000000 rs 00000060 tfd c0 serr 00000000 cmd 0004c517
Jun 13 11:30:35 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): FLUSHCACHE48.
ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jun 13 11:30:35 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): CAM status:
Command timeout
Jun 13 11:30:35 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): Retrying
command, 0 more tries remain
Jun 13 11:31:08 172.16.20.66 kernel: ahcich5: AHCI reset: device not ready
after 31000ms (tfd = 00000080)

At 11:39 I believe the following log entries are of note:

Jun 13 11:39:45 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): CAM status:
Unconditionally Re-queue Request
Jun 13 11:39:45 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): Error 5, Periph
was invalidated
Jun 13 11:39:45 172.16.20.66 ZFS[92964]: vdev state changed,
pool_guid=$5100646062824685774 vdev_guid=$8936423309855741075
Jun 13 11:39:45 172.16.20.66 ZFS[92966]: vdev is removed,
pool_guid=$5100646062824685774 vdev_guid=$8936423309855741075
Jun 13 11:39:46 172.16.20.66 kernel: g_access(961): provider ada5 has error
6 set
Jun 13 11:39:47 reactor syslogd: last message repeated 1 times
Jun 13 11:39:47 172.16.20.66 syslogd: last message repeated 1 times
Jun 13 11:39:47 172.16.20.66 kernel: ZFS WARNING: Unable to attach to ada5.

Any idea on what was the issue?

--
You are receiving this mail because:
You are the assignee for the bug.