For the first time: a completed S.M.A.R.T. captive test
Domagoj Smolčić
rank1seeker at gmail.com
Tue Jul 16 17:09:43 UTC 2019
11.2-RELEASE-p9
From the time I first started using FreeBSD until just recently, I had NEVER successfully completed a captive test with smartmontools.
No matter which HDD or smartmontools version I used, upon initiating an 'Extended captive' test, I would ALWAYS get the error: 'Interrupted (host reset)'.
This was even with nothing mounted from the device, so that only its node existed in /dev/ and nothing "chatted" with it except the kernel.
Stopping the smartd service didn't help either.
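
For reference, here is a minimal sketch of how such a test is typically requested. The exact command line is not quoted in this message, so treat the details as assumptions: the device name /dev/ada0 and the helper names are hypothetical, while '-t long' (extended test) and '-C' (captive mode) are standard smartctl options. It needs root to actually run.

```python
import subprocess


def captive_test_command(device, test="long"):
    """Build the smartctl argument list for a captive self-test.

    '-t long' selects the extended test and '-C' requests captive
    (foreground) mode. The device path is supplied by the caller;
    /dev/ada0 below is only an assumed example.
    """
    return ["smartctl", "-t", test, "-C", device]


def start_captive_test(device):
    """Run the command and return the CompletedProcess (requires root)."""
    return subprocess.run(
        captive_test_command(device),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Only prints the command; call start_captive_test() to really issue it.
    print(" ".join(captive_test_command("/dev/ada0")))
```

In captive mode the drive is busy for the duration of the test, which matches the hang described here.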
Searching the internet, I never found anyone who had succeeded with it.
Just "solutions" saying that it should never be used?!
So I started to think a little bit outside the box ...
An HDD has its OWN board with its OWN BIOS + firmware, which actually holds the S.M.A.R.T. version/capabilities, and IT executes the test issued by the OS, using its own firmware to actually run the test.
Once the HDD receives the test request from the OS, the HDD doesn't need the OS at all!
So, in order to get rid of results like:
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 2  Extended captive    Interrupted (host reset)      90%     40743         -
Suspecting that the OS (kernel?!) is pestering the HDD during its captive test and thus interrupting it, I pulled out the SATA DATA cable and left the SATA POWER cable attached AS SOON as the captive command was issued and the hang occurred (it is too late once the hang passes by itself!).
The hang stops as soon as the SATA DATA cable is unplugged. The cable is only used to transfer the test request to the HDD anyway, and all the HDD needs from that point on is JUST power and its "peace of mind"!
RESULT:
--
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended captive    Completed without error       00%     40744         -
# 2  Extended captive    Interrupted (host reset)      90%     40743         -
--
FINALLY! ==> '# 1 Extended captive Completed without error'
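
To check this without reading the log by eye, something like the following sketch could be used. It assumes the log text was captured from 'smartctl -l selftest' on the device; the function names are made up for illustration, and the description and status columns are kept as one text field because whitespace alone does not separate them reliably.

```python
import re

# Matches one entry line of the self-test log as quoted in this message,
# e.g. "# 1 Extended captive Completed without error 00% 40744 -".
ENTRY_RE = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<text>.+?)\s+(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+(?P<lba>\S+)\s*$"
)

SAMPLE = """\
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended captive Completed without error 00% 40744 -
# 2 Extended captive Interrupted (host reset) 90% 40743 -
"""


def parse_selftest_log(output):
    """Return one dict per '# N ...' entry; header lines are skipped."""
    entries = []
    for line in output.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            entries.append({
                "num": int(match.group("num")),
                "text": match.group("text"),
                "remaining_pct": int(match.group("remaining")),
                "lifetime_hours": int(match.group("hours")),
                "lba_of_first_error": match.group("lba"),
            })
    return entries


def latest_completed(entries):
    """True if the newest entry (lowest Num) completed without error."""
    if not entries:
        return False
    latest = min(entries, key=lambda entry: entry["num"])
    return "Completed without error" in latest["text"]


if __name__ == "__main__":
    print(latest_completed(parse_selftest_log(SAMPLE)))
```

The same pattern also accepts the column-aligned form that smartctl prints, since it only relies on runs of whitespace between fields.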
So ..., what can be concluded from this?
Must the kernel really "chat" with the HDD in order to keep its device node in /dev/ alive, or is it something else?
If the HDD supports captive tests, why doesn't it simply ignore the OS/kernel while one is running (it is up to the HDD's firmware code to make that decision)?
Is this, I'm not even sure what to call it ..., a borderline bug?
Anyway, it is a little bit "impractical" to use the terminal with one hand and pull out the SATA data cable with the other.
Domagoj Smolčić