Re: RFC: GEOM and hard disk LEDs

From: Andrey Fesenko <f0andrey_at_gmail.com>
Date: Fri, 17 Feb 2023 17:03:49 UTC
On Wed, Feb 8, 2023 at 2:31 AM Alan Somers <asomers@freebsd.org> wrote:
>
> Most modern SES backplanes have two LEDs per hard disk.  There's a
> "fault" LED and a "locate" LED.  You can control either one with
> sesutil(8) or, with a little more work, sg_ses from
> sysutils/sg3_utils.  They're very handy for tasks like replacing a
> failed disk, especially in large enclosures.  However, there isn't any
> way to automatically control them.  It would be very convenient if,
> for example, zfsd(8) could do it.  Basically, it would just set the
> fault LED for any disk that has been kicked out of a ZFS pool, and
> clear it for any disk that is healthy or is being resilvered.  But
> zfsd does not do that.  Instead, users' only options are to write a
> custom daemon or to use sesutil by hand.  Instead of forcing all of us
> to write our own custom daemons, why not train zfsd to do it?
>
> My proposal is to add boolean GEOM attributes for "fault" and
> "locate".  A userspace program would be able to look up their values
> for any geom with DIOCGATTR.  Setting them would require a new ioctl
> (DIOCSATTR?).  The disk class would issue an ENCIOC_SETELMSTAT to
> actually change the LEDs whenever this attribute changes.  GEOM
> transforms such as geli would simply pass the attribute through to
> lower layers.  Many-to-one transforms like gmultipath would pass the
> attribute through to all lower layers.  zfsd could then set all vdevs'
> fault attributes when it starts up, and adjust individual disks' as
> appropriate on an event-driven basis.
>
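To make the pass-through rules concrete, here is a toy userland model of them in Python. It is only a sketch under assumptions: the Geom class, its fields, and set_fault() are hypothetical names invented for illustration, not kernel structures or an existing API, and the real ENCIOC_SETELMSTAT call is replaced by flipping a boolean on the leaf disk.

```python
from dataclasses import dataclass, field


@dataclass
class Geom:
    """Toy stand-in for a GEOM node (hypothetical, not struct g_geom)."""
    name: str
    # Lower-layer geoms this one sits on; empty for a leaf disk.
    lower: list["Geom"] = field(default_factory=list)
    # State of the proposed boolean "fault" attribute (leaf disks only).
    fault: bool = False

    def set_fault(self, value: bool) -> list[str]:
        """Set "fault" and return the names of disks whose LED changed."""
        if not self.lower:
            # Leaf disk: this is where the real disk class would issue
            # ENCIOC_SETELMSTAT, and only when the value actually changes.
            changed = self.fault != value
            self.fault = value
            return [self.name] if changed else []
        # Transforms (geli, gmultipath, ...) pass the attribute through
        # to every lower layer.
        touched: list[str] = []
        for g in self.lower:
            touched += g.set_fault(value)
        return touched


da0, da1, da2 = Geom("da0"), Geom("da1"), Geom("da2")
eli = Geom("da0.eli", [da0])              # one-to-one, like geli
mpath = Geom("multipath/m0", [da1, da2])  # many-to-one, like gmultipath

print(eli.set_fault(True))    # ['da0']
print(mpath.set_fault(True))  # ['da1', 'da2']
print(mpath.set_fault(True))  # [] (no change, so no LED update)
```

The model only covers the one-to-one and many-to-one cases described above; it says nothing about one-to-many transforms such as gpart.
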
> Questions:
>
> * Are there any obvious flaws in this plan, any reasons why GEOM
> attributes can't be used this way?
>
> * For one-to-many transforms like gpart the correct behavior is less
> clear: what if a disk has two partitions in two different pools, and
> one of them is healthy but the other isn't?
>
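One possible answer for the gpart case, shown purely as an assumption and not as something the proposal specifies: treat the disk as faulted if any of its partitions is faulted. The function name and the dictionary layout below are made up for this sketch.

```python
def fault_led_any(partition_faulted: dict[str, bool]) -> bool:
    """Assumed policy: light the disk's fault LED if any partition is faulted."""
    return any(partition_faulted.values())


# da0 carries two partitions that belong to two different pools:
# the pool on da0p1 is healthy, the pool on da0p2 kicked the disk out.
da0 = {"da0p1": False, "da0p2": True}
print(fault_led_any(da0))  # True
```

The opposite policy (light the LED only when every partition is faulted) would be a one-word change from any() to all(); which one is right is exactly the open question.
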
> * Besides ZFS, are there any other systems that could take advantage?
>
> * SATA enclosures use SGPIO instead of SES.  SGPIO is too limited,
> IMHO, to be of almost any use at all.  I suggest not even trying to
> make it work with this scheme.
>
> <Originally posted to freebsd-geom; reposting here for a larger audience>
>

This is an old dream; see
https://people.freebsd.org/~mav/Enclosure_Management_en.pdf

TODO:
– add support for SGPIO/I2C interfaces in more drivers,
– associate devices with enclosure slots for non-SAS
enclosures;
– implement an in-kernel interface for the «enc» CAM driver;
– refactor the GEOM::setstate API to handle the full set of SES Array
Device Slot flags;
– explore the possibility of making ZFS report disk statuses to
GEOM using that API.