ahci.ko / geom_mirror / ZFS hang the system when one of the HDDs fails.

Lev Serebryakov lev at FreeBSD.org
Tue Jul 19 21:39:03 UTC 2011


Hello, Freebsd-hardware.

  I have had two identical livelocks when an HDD failed on an
8.2-STABLE system with two SATA HDDs with gmirror and ZFS on them.

  It is a Hetzner-hosted server, so the only access I have is the LARA
console, but the symptoms are identical in both cases: an HDD goes bad,
ahci.ko complains about timeouts, and after that the server stops
responding to high-level access attempts (ssh/HTTP/SMTP), but can still
be pinged at both its IPv4 and IPv6 addresses.

 The HDDs are identical, and they are split into several (BSD)
partitions. Some partitions are mirrored with geom_mirror and one pair
of partitions is added to a (mirrored) ZFS pool like this (I provide
output from the rebooted one-HDD-only system, but I think it is clear
how it looks when both HDDs are OK):

===================
# gmirror status
            Name    Status  Components
     mirror/root  DEGRADED  ada0s1a
      mirror/var  DEGRADED  ada0s1d
      mirror/tmp  DEGRADED  ada0s1e
      mirror/usr  DEGRADED  ada0s1f
mirror/databases  DEGRADED  ada0s1g
# zpool status
  pool: pool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool         DEGRADED     0     0     0
          mirror     DEGRADED     0     0     0
            ada0s1h  ONLINE       0     0     0
            ada0s1h  UNAVAIL      0     0     0  cannot open

errors: No known data errors
===================

 A screenshot of the LARA console in such a case is attached.

-- 
// Black Lion AKA Lev Serebryakov <lev at FreeBSD.org>

