Onboard RAID panic / reboot after CAM timeout?
Karl Pielorz
kpielorz_lst at tdx.co.uk
Mon Aug 12 13:07:07 UTC 2013
Hi,
I've got an amd64 '9.1-STABLE' box running with the system's 'onboard' RAID,
i.e.
ahci0: <Intel ICH8 AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf000-0xf01f mem 0xdfa22000-0xdfa227ff irq 19 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
This is set up, and has been running fine:
   Name   Status  Components
raid/r0  OPTIMAL  ada0 (ACTIVE (ACTIVE))
                  ada1 (ACTIVE (ACTIVE))
The other day the machine picked up a CAM timeout, and rebooted:
"
ahcich1: Timeout on slot 31 port 0
ahcich1: is 00000000 cs 00000000 ss 80000000 rs 80000000 tfd 40 serr 00000000 cmd 0004df17
(ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 c0 4a e9 40 03 00 00 00 00 00
(ada1:ahcich1:0:0:0): CAM status: Command timeout
(ada1:ahcich1:0:0:0): Retrying command
"
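For anyone reading the log above, here is a rough sketch of how the ACB bytes and the 'ss' slot mask can be decoded. It assumes the ACB bytes are printed in the order command, features, lba_low, lba_mid, lba_high, device, lba_low_exp, lba_mid_exp, lba_high_exp, features_exp, sector_count, sector_count_exp, and that for NCQ commands the transfer length sits in the features fields. The function names decode_acb and active_slots are made up for illustration and are not part of any FreeBSD tool.

```python
# Sketch only: decode a 12-byte ACB string from a CAM ATA error line.
# Assumed byte order: command, features, lba_low, lba_mid, lba_high, device,
# lba_low_exp, lba_mid_exp, lba_high_exp, features_exp,
# sector_count, sector_count_exp.

def decode_acb(acb_hex):
    """Return command, 48-bit LBA and sector count for an NCQ-style ACB."""
    b = [int(x, 16) for x in acb_hex.split()]
    if len(b) != 12:
        raise ValueError("expected 12 ACB bytes")
    # 48-bit LBA: three low bytes followed by the three 'exp' bytes.
    lba = (b[2] | b[3] << 8 | b[4] << 16 |
           b[6] << 24 | b[7] << 32 | b[8] << 40)
    # Assumption: FPDMA (NCQ) commands carry the length in features/features_exp.
    sectors = b[1] | b[9] << 8
    return {"command": b[0], "lba": lba, "sectors": sectors}

def active_slots(mask_hex):
    """List the bit positions set in a 32-bit slot mask such as 'ss' or 'rs'."""
    mask = int(mask_hex, 16)
    return [i for i in range(32) if mask >> i & 1]

info = decode_acb("61 08 c0 4a e9 40 03 00 00 00 00 00")
print(hex(info["command"]), hex(info["lba"]), info["sectors"])
print(active_slots("80000000"))
```

Under those assumptions the timed-out command is an 8-sector write, and the only bit set in 'ss' and 'rs' is bit 31, which matches the "slot 31" in the timeout message.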
By the time we'd gotten onto the box it had restarted, and had started
rebuilding the RAID array. This completed OK - and it has been OK since.
Presumably the RAID should have either recovered from/handled this, or at
least just failed ada1 and continued?
Are there any known issues with CAM timeouts on graid'ed drives not being
survivable?
Cheers,
-Karl
More information about the freebsd-geom
mailing list