[Bug 279978] After commit 25375b1415, any errors in device connected to ahci etc. results in Unretryable error
Date: Tue, 25 Jun 2024 04:04:06 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279978
Bug ID: 279978
Summary: After commit 25375b1415, any errors in device
connected to ahci etc. results in Unretryable error
Product: Base System
Version: 14.1-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: aono@cc.osaka-kyoiku.ac.jp
I have a half-broken HDD (ada2, connected to ahci1) in a FreeBSD 14.1 (p0)
server in my office.
> kernel: CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz (2672.84-MHz K8-class CPU)
> kernel: Origin="GenuineIntel" Id=0x106a4 Family=0x6 Model=0x1a Stepping=4
> kernel: ahci1: <Intel ICH10 AHCI SATA controller> port 0x7c00-0x7c07,0x7880-0x7883,0x7800-0x7807,0x7480-0x7483,0x7400-0x741f mem 0xf7ffc000-0xf7ffc7ff irq 20 at device 31.2 on pci0
> kernel: ahci1: AHCI v1.20 with 6 3Gbps ports, Port Multiplier supported
> kernel: ahcich4: <AHCI channel> at channel 2 on ahci1
> kernel: ahciem0: <AHCI enclosure management bridge> on ahci1
> kernel: ses0 at ahciem0 bus 0 scbus9 target 0 lun 0
> kernel: ses0: <AHCI SGPIO Enclosure 2.00 0001> SEMB S-E-S 2.00 device
> kernel: ses0: SEMB SES Device
> kernel: ses0: ada2,pass2 in 'Slot 02', SATA Slot: scbus5 target 0
> kernel: ada2 at ahcich4 bus 0 scbus5 target 0 lun 0
> kernel: ada2: <WDC WD60EFRX-68L0BN1 82.00A82> ACS-2 ATA SATA 3.x device
> kernel: ada2: Serial Number WD-WX41DA5LVRR4
> kernel: ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> kernel: ada2: Command Queueing enabled
> kernel: ada2: 5723166MB (11721045168 512 byte sectors)
> kernel: ada2: quirks=0x1<4K>
When I read or write a bad sector with dd (with 'sysctl
kern.geom.debugflags=16' set),
an "Unretryable error" occurs and ada2 cannot be accessed until I run
'camcontrol reset ada2'.
> kernel: (ada2:ahcich4:0:0:0): READ_FPDMA_QUEUED. ACB: 60 01 68 da 57 40 b3 00 00 00 00 00
> kernel: (ada2:ahcich4:0:0:0): CAM status: Auto-Sense Retrieval Failed
> kernel: (ada2:ahcich4:0:0:0): Error 5, Unretryable error
On FreeBSD 13.x, this error is retryable. (The following entries are
from older logs, so the sector/ACB differs.)
> kernel: (ada2:ahcich4:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e8 1b df 40 1f 01 00 08 00 00
> kernel: (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> kernel: (ada2:ahcich4:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> kernel: (ada2:ahcich4:0:0:0): RES: 41 40 b0 1c df 00 1f 01 00 00 00
> kernel: (ada2:ahcich4:0:0:0): Retrying command, 3 more tries remain
Commit 25375b1415 made the following change (shown for /sys/dev/ahci/ahci.c
only; this probably also affects siis/mvs):
diff --git a/sys/dev/ahci/ahci.c b/sys/dev/ahci/ahci.c
index 12e6ee8102da..d62a043eb2ab 100644
--- a/sys/dev/ahci/ahci.c
+++ b/sys/dev/ahci/ahci.c
@@ -2178,7 +2178,8 @@ completeall:
ahci_reset(ch);
return;
}
- ccb->ccb_h = ch->hold[i]->ccb_h; /* Reuse old header. */
+ xpt_setup_ccb(&ccb->ccb_h, ch->hold[i]->ccb_h.path,
+ ch->hold[i]->ccb_h.pinfo.priority);
if (ccb->ccb_h.func_code == XPT_ATA_IO) {
/* READ LOG */
ccb->ccb_h.recovery_type = RECOVERY_READ_LOG;
The commit message says 'only field I see used from all the header is target_id.'
But func_code is needed by the 'if' statement on the very NEXT line.
After xpt_setup_ccb(), func_code always has the same value (probably 0), so the
'if' condition (XPT_ATA_IO in the code above) never matches and we always take
the 'REQUEST SENSE' path in the 'else' block. This is problematic.
Copying more of the CCB header (at least func_code) or changing the 'if'
condition (e.g. 'if (ch->hold[i]->ccb_h.func_code == XPT_ATA_IO) { ...')
would solve this issue. I tried adding xpt_merge_ccb()
after xpt_setup_ccb() (booting with the modified kernel seems to work fine),
but I'm not sure whether this is the right fix.
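The control flow described above can be modelled in a few lines of standalone C. This is only a sketch, not kernel code: every `model_*` / `MODEL_*` name is a hypothetical stand-in for the corresponding CAM definition, and it assumes (as guessed above) that a freshly set-up header leaves func_code at 0. It compares the post-commit test on the new header with the alternative of testing the held CCB's func_code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real CAM definitions. */
enum model_func_code {
	MODEL_XPT_NOOP = 0,	/* assumed value left by the setup routine */
	MODEL_XPT_SCSI_IO,
	MODEL_XPT_ATA_IO
};

enum model_recovery {
	MODEL_RECOVERY_READ_LOG,
	MODEL_RECOVERY_REQUEST_SENSE
};

/* Reduced model of a CCB header: only the fields discussed in this report. */
struct model_ccb_hdr {
	enum model_func_code func_code;
	int target_id;
	int priority;
};

/*
 * Assumption: like xpt_setup_ccb(), this fills in addressing and priority
 * but does not carry over func_code from the held CCB.
 */
static void
model_setup_ccb(struct model_ccb_hdr *hdr, int target_id, int priority)
{
	memset(hdr, 0, sizeof(*hdr));	/* func_code == MODEL_XPT_NOOP */
	hdr->target_id = target_id;
	hdr->priority = priority;
}

/* Post-commit behaviour: the branch tests the freshly set-up header. */
static enum model_recovery
model_pick_recovery_fresh(const struct model_ccb_hdr *held)
{
	struct model_ccb_hdr fresh;

	model_setup_ccb(&fresh, held->target_id, held->priority);
	if (fresh.func_code == MODEL_XPT_ATA_IO)
		return (MODEL_RECOVERY_READ_LOG);
	return (MODEL_RECOVERY_REQUEST_SENSE);	/* always taken */
}

/* Suggested alternative: the branch tests the held CCB's func_code. */
static enum model_recovery
model_pick_recovery_held(const struct model_ccb_hdr *held)
{
	struct model_ccb_hdr fresh;

	model_setup_ccb(&fresh, held->target_id, held->priority);
	(void)fresh;	/* the new header is still set up, just not tested */
	if (held->func_code == MODEL_XPT_ATA_IO)
		return (MODEL_RECOVERY_READ_LOG);
	return (MODEL_RECOVERY_REQUEST_SENSE);
}

/* Small self-check that documents the expected outcome for an ATA command. */
static void
model_self_check(void)
{
	struct model_ccb_hdr held = { MODEL_XPT_ATA_IO, 0, 1 };

	assert(model_pick_recovery_fresh(&held) == MODEL_RECOVERY_REQUEST_SENSE);
	assert(model_pick_recovery_held(&held) == MODEL_RECOVERY_READ_LOG);
}
```

Calling model_self_check() from a small main() exercises both variants. In this model a held ATA command is sent down the REQUEST SENSE path when the fresh header is tested, and gets READ LOG only when the held header is consulted, which matches the behaviour difference between 14.1 and 13.x shown in the logs above.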