AHCI timeouts on S3 resume

Damian Gerow dgerow at afflictions.org
Wed May 19 13:48:55 UTC 2010


Jeremy Chadwick wrote:
: On Tue, May 18, 2010 at 10:14:03PM -0400, Damian Gerow wrote:
: > A few months back, I swapped out my dying hard drive for a WD Scorpio Blue.
: > Cheap, seemed reliable, and it was the only drive the local shop had in
: > stock.  However, it seems that AHCI doesn't like this device, and is having
: > trouble during an S3 resume.  It appears as though I'm experiencing two
: > types of timeouts when resuming: recoverable, and non-recoverable.
: > 
: > My question is: do I have a bad HDD, or is AHCI just not playing nicely?
: 
: Your hard disk looks generally OK; it isn't going bad.  The one thing I
: can't tell is whether the disk is actually spinning back up on
: resume; you'd have to literally listen for it, or look at SMART
: Attribute #4 before and after a suspend/resume.  I'll discuss analysis
: of SMART statistics further down.

The disk spins back up immediately on resume.  I have no recollection of it
/not/ doing so (it's definitely noticeable), and I just confirmed it with a
few S3 cycles.
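
For anyone who would rather not rely on listening for the drive, the
before/after comparison of SMART attribute #4 suggested above could be
scripted.  This is only a rough sketch: it assumes the text comes from
`smartctl -A` in the column layout quoted further down in this thread, the
helper names are made up for illustration, and the "after" row is a
hypothetical value, not a real reading.

```python
def smart_raw_value(smartctl_output, attr_id):
    """Return RAW_VALUE for one attribute from `smartctl -A` style text."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID and carry 10 columns;
        # RAW_VALUE is the last of them (index 9).
        if len(fields) >= 10 and fields[0] == str(attr_id):
            return int(fields[9])
    raise KeyError(f"attribute {attr_id} not found")


def start_stop_delta(before_output, after_output):
    """How much Start_Stop_Count (attribute 4) grew between two readings."""
    return smart_raw_value(after_output, 4) - smart_raw_value(before_output, 4)


# BEFORE is the row quoted in this thread; AFTER is an invented example.
BEFORE = "  4 Start_Stop_Count        0x0032   055   055   000    Old_age   Always       -       45174"
AFTER = "  4 Start_Stop_Count        0x0032   055   055   000    Old_age   Always       -       45175"

print(start_stop_delta(BEFORE, AFTER))  # prints 1
```

A delta of at least 1 across a suspend/resume would suggest the drive
stopped and started again; a delta of 0 would be worth a closer look.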

I also checked the WD spec sheet, and the average drive ready time is 4s.

: I will point out, however, that you've set this value in loader.conf:
: 
: > hw.pci.do_power_nodriver="2"
: 
: I've read the sysctl -d description for it, but I am not familiar with
: sleep/power states so I don't know the implications.  I worry that this
: value may be causing problems with your ICH9 controller.  If you could
: comment this out and re-try suspend/resume to see if AHCI times out, you
: might determine if it's responsible for the problem.

That *should* just remove power from devices without a driver.  But I
removed it, rebooted, went through two S3 cycles, and I'm still seeing the
timeouts.  (Recoverable; of the two cycles I did, I didn't see a
non-recoverable timeout.)

: > The HDD is a WD Scorpio Blue, model WD5000BEVT-22A0RT0, and isn't exactly
: > the fastest drive on the planet.  SMART seems to be relatively clean, with
: > some mild questions surrounding attributes 191, 9/193, and 194:
: > 
: > -----
: > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
: >   3 Spin_Up_Time            0x0027   186   185   021    Pre-fail  Always       -       1675
: >   4 Start_Stop_Count        0x0032   055   055   000    Old_age   Always       -       45174
: >   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       723
: > 191 G-Sense_Error_Rate      0x0032   072   072   000    Old_age   Always       -       28
: > 193 Load_Cycle_Count        0x0032   162   162   000    Old_age   Always       -       115712
: > 194 Temperature_Celsius     0x0022   112   106   000    Old_age   Always       -       35
: > -----

: Attribute #9 indicates the total amount of time the hard disk has been
: powered on (read: not asleep) during its lifetime.  I can't tell you
: whether or not this value is correct; only you would be able to
: determine that, given your usage patterns.  I *have* seen desktop drives
: which have reported this value incorrectly (meaning, servers I know have
: been on for thousands of hours that show "4" for this RAW_VALUE;
: probably a firmware bug).

I grouped attributes 9 and 193 together because a load cycle count of ~116k
over just 723 power-on hours seems high.  I believe laptop HDDs
are designed to handle a higher rate of load cycle counts, but I've never
really paid attention to them -- save on my previously dying drive, which
had broken 1M, and started screeching when doing some seeks.

But yes, that 723 power-on hours seems accurate.
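
To put a number on "a bit high", here is the back-of-the-envelope
arithmetic using the two raw values from the SMART output quoted above.  It
assumes both counters cover the same lifetime of the drive and that the
cycles are spread evenly, which is a simplification.

```python
load_cycles = 115712     # attribute 193 RAW_VALUE from the quoted output
power_on_hours = 723     # attribute 9 RAW_VALUE from the quoted output

# Average rate of head load/unload cycles while the drive was powered on.
cycles_per_hour = load_cycles / power_on_hours
seconds_per_cycle = 3600 / cycles_per_hour

print(f"{cycles_per_hour:.0f} load cycles per power-on hour")
print(f"one load cycle every {seconds_per_cycle:.1f} s on average")
```

That works out to roughly 160 cycles per hour, or one about every 22
seconds of power-on time.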

: Attribute #193 indicates the number of times the actuator arm (thus
: heads) has been parked or come out of being parked.  There is a known
: problem with some models of WD "Green Power" (GP) drives where the drive
: spends an excessive amount of time parking, and this counter increases
: rapidly.  One FreeBSD user who reported this problem to Western Digital
: received a replacement firmware which addressed the problem.  The WD
: Scorpio Blue drives (or some of them) may have this same problem --
: HOWEVER, this model of hard disk (2.5" FF) is *specifically* intended
: for laptops and low-power environments, so the behaviour seen in this
: case could be 100% normal.  WD would hopefully know.

I'm fairly certain that WD only includes that IntelliPark feature on the GP
drives.  At least, WD doesn't indicate that there's any of their fancy new
GP-related tricks on the Scorpio Blue line.

I'd actually recently dropped my vfs.zfs.txg.timeout to 5, as I was
experiencing some pretty horrible stalls when it was left at default (30, I
believe).  I was curious to see if this decreased the rate of my
Load_Cycle_Count, but I'm already at ~122k.  Given that this drive is rated
to handle 600k, it makes me wonder if there isn't something like IntelliPark
on this drive.
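
As a rough projection of what that rating means in practice: the sketch
below assumes the average rate implied by the SMART snapshot quoted earlier
(115712 cycles over 723 power-on hours) simply continues, and it rounds the
current count to 122000.  Both are assumptions, and the function name is
just for illustration.

```python
def hours_until_rated_limit(current_cycles, rated_cycles, cycles_per_hour):
    """Power-on hours left before the load cycle count reaches its rating."""
    remaining = max(rated_cycles - current_cycles, 0)
    return remaining / cycles_per_hour


# Average rate from the earlier SMART snapshot (attributes 193 and 9).
observed_rate = 115712 / 723

# ~122k current cycles (rounded) against the 600k rating.
hours_left = hours_until_rated_limit(122_000, 600_000, observed_rate)

print(f"about {hours_left:.0f} power-on hours until 600k load cycles")
print(f"that is about {hours_left / 24:.0f} days of continuous uptime")
```

At that pace the rating would be reached in a bit under 3000 power-on
hours, so it will be easy to see from later readings whether the lower
vfs.zfs.txg.timeout changes the slope at all.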

: Hope this helps.

Aye.  It confirms that SMART clears my drive -- thanks!


More information about the freebsd-stable mailing list