AHCI timeouts on S3 resume
Damian Gerow
dgerow at afflictions.org
Wed May 19 13:48:55 UTC 2010
Jeremy Chadwick wrote:
: On Tue, May 18, 2010 at 10:14:03PM -0400, Damian Gerow wrote:
: > A few months back, I swapped out my dying hard drive for a WD Scorpio Blue.
: > Cheap, seemed reliable, and it was the only drive the local shop had in
: > stock. However, it seems that AHCI doesn't like this device, and is having
: > troubles during an S3 resume. It appears as though I'm experiencing two
: > types of timeouts when resuming: recoverable, and non-recoverable.
: >
: > My question is: do I have a bad HDD, or is AHCI just not playing nicely?
:
: Your hard disk looks generally OK; it isn't going bad. The one thing I
: can't tell is whether the disk is actually spinning back up on
: resume; you'd have to literally listen for it, or look at SMART
: Attribute #4 before and after a suspend/resume. I'll discuss analysis
: of SMART statistics further down.
The disk spins back up immediately on resume. I have no recollection of it
/not/ doing so (it's definitely noticeable), and I just confirmed it with a
few S3 cycles.
I also checked the WD spec sheet, and the average drive ready time is 4s.
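For anyone who would rather check this by counter than by ear, here is a
minimal sketch of the before/after comparison Jeremy suggests. It assumes the
attribute table has been captured as text (for example from smartctl -A) once
before suspend and once after resume. The function names raw_values and deltas
are my own, and the sample rows are copied from the SMART output quoted in
this thread.

```python
def raw_values(smartctl_output: str) -> dict:
    """Map SMART attribute ID -> integer RAW_VALUE from an attribute table."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID;
        # the header row starts with "ID#" and is skipped by the isdigit() test.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[int(fields[0])] = int(fields[9])
    return values


def deltas(before: str, after: str) -> dict:
    """Return {attribute ID: change} for every attribute whose raw value changed."""
    b, a = raw_values(before), raw_values(after)
    return {k: a[k] - b[k] for k in b if k in a and a[k] != b[k]}


# Rows taken from the SMART output quoted in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  4 Start_Stop_Count 0x0032 055 055 000 Old_age Always - 45174
193 Load_Cycle_Count 0x0032 162 162 000 Old_age Always - 115712
"""

if __name__ == "__main__":
    print(raw_values(SAMPLE))
```

If attribute 4 (Start_Stop_Count) goes up by one across a suspend/resume
cycle, the drive did stop and spin up again.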
: I will point out, however, that you've set this value in loader.conf:
:
: > hw.pci.do_power_nodriver="2"
:
: I've read the sysctl -d description for it, but I am not familiar with
: sleep/power states so I don't know the implications. I worry that this
: value may be causing problems with your ICH9 controller. If you could
: comment this out and re-try suspend/resume to see if AHCI times out, you
: might determine if it's responsible for the problem.
That *should* just remove power from devices without a driver. But I
removed it, rebooted, went through two S3 cycles, and I'm still seeing the
timeouts. (Recoverable; of the two cycles I did, I didn't see a
non-recoverable timeout.)
: > The HDD is a WD Scorpio blue, model WD5000BEVT-22A0RT0, and isn't exactly
: > the fastest drive on the planet. SMART seems to be relatively clean, with
: > some mild questions surrounding attributes 191, 9/193, and 194:
: >
: > -----
: > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
: >   3 Spin_Up_Time            0x0027   186   185   021    Pre-fail  Always       -       1675
: >   4 Start_Stop_Count        0x0032   055   055   000    Old_age   Always       -       45174
: >   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       723
: > 191 G-Sense_Error_Rate      0x0032   072   072   000    Old_age   Always       -       28
: > 193 Load_Cycle_Count        0x0032   162   162   000    Old_age   Always       -       115712
: > 194 Temperature_Celsius     0x0022   112   106   000    Old_age   Always       -       35
: > -----
: Attribute #9 indicates the total amount of time the hard disk has been
: powered on (read: not asleep) during its lifetime. I can't tell you
: whether or not this value is correct; only you would be able to
: determine that, given your usage patterns. I *have* seen desktop drives
: which have reported this value incorrectly (meaning, servers I know have
: been on for thousands of hours that show "4" for this RAW_VALUE;
: probably a firmware bug).
I grouped attributes 9 and 193 together because a load cycle count of ~116k
over 723 power-on hours seems a bit high. I believe laptop HDDs
are designed to handle a higher rate of load cycle counts, but I've never
really paid attention to them -- save on my previously dying drive, which
had broken 1M, and started screeching when doing some seeks.
But yes, that 723 power-on hours seems accurate.
: Attribute #193 indicates the number of times the actuator arm (thus
: heads) has been parked or come out of being parked. There is a known
: problem with some models of WD "Green Power" (GP) drives where the drive
: spends an excessive amount of time parking, and this counter increases
: rapidly. One FreeBSD user who reported this problem to Western Digital
: received a replacement firmware which addressed the problem. The WD
: Scorpio Blue drives (or some of them) may have this same problem --
: HOWEVER, this model of hard disk (2.5" FF) is *specifically* intended
: for laptops and low-power environments, so the behaviour seen in this
: case could be 100% normal. WD would hopefully know.
I'm fairly certain that WD only includes that IntelliPark feature on the GP
drives. At least, WD doesn't indicate that there's any of their fancy new
GP-related tricks on the Scorpio Blue line.
I'd actually recently dropped my vfs.zfs.txg.timeout to 5, as I was
experiencing some pretty horrible stalls when it was left at default (30, I
believe). I was curious to see if this decreased the rate of my
Load_Cycle_Count, but I'm already at ~122k. Given that this drive is rated
to handle 600k, it makes me wonder if there isn't something like IntelliPark
on this drive.
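To put rough numbers on that worry, here is a back-of-the-envelope sketch. It
assumes the load-cycle rate implied by the SMART snapshot quoted earlier
(115712 cycles over 723 power-on hours) stays constant, and it treats the
~122k current count as exact. The function names are my own, and the 600k
figure is the rating mentioned above.

```python
def load_cycles_per_hour(load_cycles: int, power_on_hours: int) -> float:
    """Average load cycles per power-on hour over the drive's life so far."""
    return load_cycles / power_on_hours


def hours_until_rating(rated_cycles: int, current_cycles: int, rate: float) -> float:
    """Power-on hours left before the rated cycle count, at a constant rate."""
    return max(rated_cycles - current_cycles, 0) / rate


# Snapshot from the SMART output quoted earlier in this thread.
rate = load_cycles_per_hour(115_712, 723)

# ~122k is the approximate current count; 600k is the rated figure.
remaining = hours_until_rating(600_000, 122_000, rate)

print(f"{rate:.0f} cycles/hour, roughly {remaining:.0f} power-on hours to the rated count")
```

That works out to a load/unload cycle every 20-odd seconds of power-on time,
which is why the counter looks suspicious to me.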
: Hope this helps.
Aye. It confirms that SMART clears my drive -- thanks!