Drive Disconnection

mark.jacobs at custserv.com
Fri Oct 24 18:39:37 PDT 2008


Thanks for all the great information. I'm going to try the USB solution for now, since the drive ran fine for several months on this server over USB until I began experimenting with the eSATA connection.

If perchance USB doesn't work, I will try both getting the SMART status from the drive and getting a better SATA controller.

Mark Jacobs
Technical Services
Time Customer Service, Tampa FL (Go Rays)

-----Original Message-----
From: Jeremy Chadwick [mailto:koitsu at FreeBSD.org]
Sent: Fri 10/24/2008 9:15 PM
To: Jacobs, Mark - Data Center Operations <mark.jacobs at custserv.com>
Cc: freebsd-questions at freebsd.org
Subject: Re: Drive Disconnection
 
On Fri, Oct 24, 2008 at 07:44:41PM -0400, mark.jacobs at custserv.com wrote:
> It is a LaCie d2 Quadra drive, but FreeBSD reports this:
> server kernel: ad4: 953869MB <Hitachi HDS721010KLA330 GKAOA70M> at ata2-master SATA150
> 
> When I run the rsync, I receive these errors:
> 
> Oct 24 12:47:13 server kernel: ad4: FAILURE - device detached
> Oct 24 12:47:13 server kernel: subdisk4: detached
> Oct 24 12:47:13 server kernel: ad4: detached
> Oct 24 12:47:13 server kernel: g_vfs_done():ad4s1a[WRITE(offset=144332767232, length=131072)]error = 6
> Oct 24 12:47:13 server kernel: g_vfs_done():ad4s1a[WRITE(offset=144332898304, length=131072)]error = 6
>
> The write failure messages keep being issued until the server reboots.
> It isn't in the log, but I receive a dirty buffer panic.

It appears the disk is literally falling off of the SATA bus.  The
g_vfs_done errors you see are a result of that.  I'll explain the
reboot in a moment.
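
The "error = 6" in those g_vfs_done lines is an errno value; on FreeBSD
6 is ENXIO ("Device not configured"), which fits a device that has gone
away.  As a minimal sketch, the Python below pulls the device, offset,
length, and errno name out of such a line.  The regex and the function
name parse_g_vfs_done are my own illustration, not part of any kernel or
library interface.

```python
import errno
import re

# Matches kernel messages such as:
#   g_vfs_done():ad4s1a[WRITE(offset=144332767232, length=131072)]error = 6
G_VFS_DONE = re.compile(
    r"g_vfs_done\(\):(?P<dev>\w+)\[(?P<op>\w+)\(offset=(?P<offset>\d+), "
    r"length=(?P<length>\d+)\)\]error = (?P<err>\d+)"
)


def parse_g_vfs_done(line):
    """Return a dict describing one g_vfs_done failure, or None if no match."""
    m = G_VFS_DONE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    return {
        "device": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "errno": err,
        # errno names come from the platform running this script; 6 is
        # ENXIO on FreeBSD and on Linux alike.
        "errno_name": errno.errorcode.get(err, "UNKNOWN"),
    }


if __name__ == "__main__":
    sample = ("Oct 24 12:47:13 server kernel: g_vfs_done():ad4s1a"
              "[WRITE(offset=144332767232, length=131072)]error = 6")
    print(parse_g_vfs_done(sample))
```

The two offsets in your log differ by exactly 131072 bytes, i.e. they
are consecutive 128KB writes failing against the detached device.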

There could be tons of reasons for the disk disappearing.  I'll list off
some of the possibilities that come to mind:

* Drive losing power
  - Shoddiness inside of the d2 Quadra enclosure, such as bad internal
    cabling or manufacturing defects,
  - AC adapter for d2 Quadra is faulty,
  - d2 Quadra could offer some kind of "sleep mode" where the unit goes
    into a low-power-save state, and the disk ends up falling off the
    bus during this time.

* SATA300 vs. SATA150 compatibility issues
  - VIA and SiS chipsets are known to experience data corruption, disks
    falling off the bus, or other insanity when SATA300 disks are
    connected to those chipsets.  The chipsets support SATA300, but are
    downright buggy.  Workaround is to force the drive to SATA150 speed
    using jumpers on the disk (only *some* manufacturers offer this),
  - The Hitachi disk in your d2 Quadra is spec'd at SATA300, while it's
    obvious your Silicon Image SATA controller is only detecting
    SATA150 (yet LaCie claims this enclosure does SATA300).  The 7K1000
    series drives *do not* have a force-SATA150 jumper (I've checked),
    which is too bad, since forcing SATA150 might fix the problem.

* d2 Quadra USB/FW/eSATA controller bug
  - I have no idea what chip is inside of that enclosure, but many of
    them are "bridges", i.e. they're USB/FW controllers that have a
    horribly shoddy "SATA emulation" interface on top of them,
  - Could be a firmware bug with the controller used in the enclosure,
  - Controller may not be 100% compatible with Silicon Image devices.

* Silicon Image SATA controller bugs

As for why the system reboots: what you're experiencing is probably a
kernel panic.  On FreeBSD, when you have a filesystem that's mounted and
the underlying device (disk, etc.) is yanked out from underneath, the
kernel will panic; this is by design.  I've been told by lower-level
folks that CURRENT supposedly addresses this issue, but I haven't
personally confirmed it.

I would still like to see SMART stats on the drive.  Why?  Because SMART
stats will show me if the drive is actually losing power or not (the
Power_Cycle_Count attribute should increment).

You'll need to install ports/sysutils/smartmontools, then run "smartctl
-a /dev/ad4".  Save that data somewhere, then run your rsync.  Your
machine will reboot (a soft reset, hopefully!), and once it's back up,
run the same smartctl command again, and save that data.  Then you can
compare the adjusted attributes and RAW_VALUEs; I can help you with
reading this data if need be (people often misread it).
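
To make the before/after comparison less error-prone, something like the
Python sketch below could diff the RAW_VALUE column of the two saved
reports.  It assumes the usual ten-column attribute table that smartctl
prints (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED,
WHEN_FAILED, RAW_VALUE).  The function names and the sample numbers are
invented for illustration; they are not readings from your drive.

```python
def parse_attributes(report):
    """Map attribute name -> RAW_VALUE string from saved smartctl output."""
    attrs = {}
    in_table = False
    for line in report.splitlines():
        if line.startswith("ID#"):
            in_table = True  # header row of the attribute table
            continue
        if in_table:
            # Split into at most ten columns; RAW_VALUE may contain spaces.
            fields = line.split(None, 9)
            if len(fields) < 10:
                break  # a blank or short line ends the table
            attrs[fields[1]] = fields[9].strip()
    return attrs


def changed_attributes(before, after):
    """Return {name: (old_raw, new_raw)} for attributes whose raw value changed."""
    old, new = parse_attributes(before), parse_attributes(after)
    return {name: (old[name], new[name])
            for name in old
            if name in new and old[name] != new[name]}


if __name__ == "__main__":
    header = ("ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH "
              "TYPE UPDATED WHEN_FAILED RAW_VALUE\n")
    # Hypothetical rows; a real report has many more attributes.
    before = header + (
        "  9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 2210\n"
        " 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 41\n")
    after = header + (
        "  9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 2210\n"
        " 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 42\n")
    for name, (old_raw, new_raw) in changed_attributes(before, after).items():
        print(f"{name}: {old_raw} -> {new_raw}")
```

A Power_Cycle_Count that goes up across the failed rsync would point at
the drive losing power rather than at the controller.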

> I don't have easy access to a 7.1 system with an ESATA port.

That's disappointing, as it would be useful to know if 7.1-PRERELEASE
behaves the same way for you.  Based on the above I'd say it probably
does, but it's always good to check.

> I'm currently redoing the entire process (wipe, build filesystem, mount,
> rsync) using the USB port. If that works I'm going to junk the idea of
> using the ESATA card for the drive.

I would _highly_ recommend you reconsider this.  USB on FreeBSD is in an
even worse state (and I am not exaggerating) than ATA/SATA is.  If your
disk is falling off the bus with SATA, the same will likely happen with
USB, and you'll experience the same problem.

> Can you recommend an ESATA card that fits in a PCI slot since my
> server doesn't have a PCI-E slot?

Promise makes the SATA300 TX4302 controller, which is PCI, and provides
two eSATA ports, plus two internal SATA ports.  I believe this card goes
for US$70-100.  Promise's website (for me) appears to be malfunctioning
(webserver answers, but stalls indefinitely), so I can't easily check
their products list.

I don't think HighPoint makes any eSATA-capable controllers that are
standard PCI or PCI-X; all appear to be PCI Express.

If your motherboard has on-board SATA support that *does not* use a
Silicon Image, VIA, or SiS chip and instead uses something like an Intel
ICH or nVidia nForce controller, I would recommend buying something like
one of these and using it instead:

http://www.icydock.com/product/MB559power_bracket.html
http://www.cooldrives.com/essaii3gbexp.html
http://www.newegg.com/Product/Product.aspx?Item=N82E16812119021

Finally, and I don't know if you're doing this, but be aware that you
can't "hot-swap" disks via eSATA unless the controller fully supports
hot-swapping.  Meaning: you can't yank that d2 Quadra enclosure off the
eSATA port whenever you feel like it.  You'll need to use "atacontrol
detach" to properly detach it first, and even that assumes the SATA
controller you're using supports hot-swapping (things with AHCI behave
fairly well in this regard).
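
For reference, atacontrol works on the ATA channel, which you can read
off the probe line ("ad4: ... at ata2-master" means channel ata2).  The
Python sketch below derives the unmount-then-detach sequence from that
line.  It only builds the command lists and does not run anything; the
mount point /backup is a placeholder I made up, and you should check
atacontrol(8) on your release before relying on the syntax.

```python
import re

# Matches probe lines such as:
#   ad4: 953869MB <Hitachi HDS721010KLA330 GKAOA70M> at ata2-master SATA150
PROBE = re.compile(r"(?P<disk>ad\d+): .* at (?P<channel>ata\d+)-(?:master|slave)")


def detach_commands(probe_line, mount_point):
    """Return the commands to unmount a filesystem and detach its ATA channel."""
    m = PROBE.search(probe_line)
    if m is None:
        raise ValueError("not an ATA probe line: %r" % probe_line)
    return [
        ["umount", mount_point],                        # unmount before detaching
        ["atacontrol", "detach", m.group("channel")],   # e.g. atacontrol detach ata2
    ]


if __name__ == "__main__":
    line = ("server kernel: ad4: 953869MB <Hitachi HDS721010KLA330 GKAOA70M> "
            "at ata2-master SATA150")
    for cmd in detach_commands(line, "/backup"):  # "/backup" is hypothetical
        print(" ".join(cmd))
```

Unmounting first matters because, as described above, detaching the
device under a mounted filesystem is what leads to the panic.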

> -----Original Message-----
> From: Jeremy Chadwick [mailto:koitsu at FreeBSD.org]
> Sent: Fri 10/24/2008 7:09 PM
> To: Jacobs, Mark - Data Center Operations <mark.jacobs at custserv.com>
> Cc: freebsd-questions at freebsd.org
> Subject: Re: Drive Disconnection
>  
> On Fri, Oct 24, 2008 at 02:02:41PM -0400, Mark Jacobs wrote:
> > I have an external LaCie 1TB drive attached to a FreeBSD 6.4-PRERELEASE
> > system via an ESATA connection.
> > 
> > atapci0: <SiI SiI 3512 SATA150 controller>
> > 
> > I cleaned off the drive by writing random data to it. The write took
> > overnight and didn't experience any problems. I then added a filesystem
> > to the drive and mounted it on the system.
> > 
> > However when I perform an rsync backup from a FreeBSD 7.1 PRERELEASE
> > system to the drive over an NFS connection the drive disconnects and the
> > server reboots.
> 
> You've not provided enough information to help track this down.  What
> model/brand of disk is attached to that controller?  What does smartctl
> -a have to say about the disk?  What gets printed on the console before
> it reboots?  Do you have the same problem if you run
> 7.1-PRERELEASE/BETA2?
> 
> > Does anyone have an idea where to go from here?
> 
> The only generic advice I can give you at this point is to avoid
> Silicon Image controllers, particularly their SATA controllers.  They
> have a history of causing data corruption on Linux, FreeBSD, and
> Windows, and some have reported other miscellaneous problems with them
> as well.  There's not enough evidence in this thread so far to blame the
> SiI controller, but when I see them, I become immediately suspicious.
> 
> -- 
> | Jeremy Chadwick                                jdc at parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
> 
> 

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



