[Bug 257042] mpr driver fails to wake "idle" disks on reboot

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 07 Jul 2021 16:23:40 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257042

            Bug ID: 257042
           Summary: mpr driver fails to wake "idle" disks on reboot
           Product: Base System
           Version: 12.2-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: karl@denninger.net

Observed on 12.2-STABLE and on earlier 12.2; prior versions may also be affected.

Context:
mpr0: <Avago Technologies (LSI) SAS3008> port 0x6000-0x60ff mem
0xa2540000-0xa254ffff,0xa2500000-0xa253ffff irq 16 at device 0.0 on pci1
mpr0: Firmware: 16.00.01.00, Driver: 23.00.00.00-fbsd
mpr0: IOCCapabilities: 7a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc,FastPath,RDPQArray>
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 1.1 on pci0
pci2: <ACPI PCI bus> on pcib2
mpr1: <Avago Technologies (LSI) SAS3008> port 0x5000-0x50ff mem
0xa2340000-0xa234ffff,0xa2300000-0xa233ffff irq 17 at device 0.0 on pci2
mpr1: Firmware: 16.00.01.00, Driver: 23.00.00.00-fbsd
mpr1: IOCCapabilities: 7a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc,FastPath,RDPQArray>

Several spinning-rust disks are attached to this adapter; the boot pool is on
the motherboard ATA ports.  The boot pool consists of SSDs and is not affected.

The firmware is disabled in the card as I do not wish to use its onboard
facilities, nor boot from it. Starting from a power-off state the system boots
normally, spins up the disks, and you get something similar to this:

Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x0009>
enclosureHandle<0x0001> slot 7
mpr0: At enclosure level 0 and connector name (    )
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
mpr1: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x0009>
enclosureHandle<0x0001> slot 0
mpr1: At enclosure level 0 and connector name (    )
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000a>
enclosureHandle<0x0001> slot 3
mpr0: At enclosure level 0 and connector name (    )
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000b>
enclosureHandle<0x0001> slot 2
mpr0: At enclosure level 0 and connector name (    )

.... and so on until all of the disks are enumerated, then the boot continues
and all is well.

At the end of startup, in /etc/rc.local, I do this:

/sbin/camcontrol idle da0 -t 600
/sbin/camcontrol idle da1 -t 600
/sbin/camcontrol idle da2 -t 600
/sbin/camcontrol idle da3 -t 600
/sbin/camcontrol idle da4 -t 600
/sbin/camcontrol idle da5 -t 600
/sbin/camcontrol idle da6 -t 600
/sbin/camcontrol idle da7 -t 600

The intent is to set a 10-minute spin-down on those disks to conserve power,
since they all belong to a "bulk storage" pool that is rarely accessed during
normal operations. This works well while the system is running, at the expense
of a brief delay when the pool is accessed after 10 or more minutes of
inactivity, because its member disks must first spin up.
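
The eight camcontrol commands above can also be written as a loop. This is
only a sketch: CAMCONTROL and IDLE_SECS are illustrative variable names that
do not appear in the report, and the device names da0 through da7 are taken
from the list above.

```sh
#!/bin/sh
# Sketch: set a 600-second idle timer on da0 through da7.
# CAMCONTROL and IDLE_SECS are illustrative names, not part of the report.
CAMCONTROL="${CAMCONTROL:-/sbin/camcontrol}"
IDLE_SECS=600

set_idle_timers() {
    for n in 0 1 2 3 4 5 6 7; do
        # Same invocation as in the report: camcontrol idle daN -t 600
        "$CAMCONTROL" idle "da${n}" -t "$IDLE_SECS" ||
            echo "warning: could not set idle timer on da${n}" >&2
    done
}

set_idle_timers
```

A failure on one disk only prints a warning, so the remaining disks still get
their timers set.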

The problem is that if I reboot the system (e.g. "reboot" from a root login,
such as when updating the kernel and/or world) without powering it off (or the
system panics and reboots) the driver *does not* wake these drives up.  It does
so from a cold start (apparently detecting the spun-down state and sending
"spin-up" commands on a cold start) but doesn't send the necessary command(s)
to wake disks from an idle state.  As such if any of these devices ARE idle
when a non-power-cycled reboot occurs the system spins on "Root mount waiting
for: CAM" forever.

This appears to be an omission in the FreeBSD driver code; the documentation
for camcontrol notes that a "sleep" may require a reset, but both "idle" and
"standby" should be woken by activity to the target device.
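
Since idle and standby disks should wake on access, one possible stopgap for
planned reboots is to touch each disk just before rebooting. The following is
an untested sketch, not a fix for the driver: DISKS, DEVDIR and wake_disks are
hypothetical names, it assumes that a one-sector raw read is enough to make
the drive spin up, and it does nothing for a panic-triggered reboot.

```sh
#!/bin/sh
# Hypothetical pre-reboot helper: read one sector from each disk so that
# the access itself prompts an idle or standby drive to spin up.
# DISKS and DEVDIR are illustrative names, not from the report.
DISKS="${DISKS:-da0 da1 da2 da3 da4 da5 da6 da7}"
DEVDIR="${DEVDIR:-/dev}"

wake_disks() {
    for d in $DISKS; do
        # A single 512-byte read from the device; the data is discarded.
        if dd if="${DEVDIR}/${d}" of=/dev/null bs=512 count=1 2>/dev/null; then
            echo "read ok: ${d}"
        else
            echo "read failed: ${d}" >&2
        fi
    done
}

wake_disks
```

Whether the disks then stay awake long enough to be enumerated after the
reboot would need to be confirmed on affected hardware.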
