[direct] Re: Strange system lockups - kernel saying disk error

Dave dave at g8kbv.demon.co.uk
Sun Jun 5 10:31:15 UTC 2011

Hi again.

Thanks for the reply.

Re the old disk drives.  I have several 10 year or older IDE drives here, 
from 2G to 40G in 24/7 use, and (so far!) they are happy.  I also have a 
few much newer SATA drives that fail to even spin up.

Like anything, there is a "Bath Tub" curve re hardware failure rates.  
Lots of infant mortality in a short period, that we rarely see because 
they fail at the factory.  Then a  l o  n   g    time with low rates of 
problems, eventually slowly rising as things age, until it's truly cheaper 
to replace than repair.
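
For anyone who wants to see the shape of that curve, here is a rough sketch 
in Python.  It adds a falling Weibull hazard (infant mortality), a flat 
baseline, and a rising Weibull hazard (wear out).  Every number in it is 
made up for illustration and does not come from any real drive data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (k/s) * (t/s)**(k-1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Illustrative failure rate per drive per year at a given age."""
    # All parameter values here are assumptions chosen to give a tub shape.
    infant = weibull_hazard(t_years, shape=0.5, scale=50.0)    # falls with age
    random_failures = 0.01                                     # flat baseline
    wear_out = weibull_hazard(t_years, shape=5.0, scale=12.0)  # rises with age
    return infant + random_failures + wear_out


for age in (0.01, 0.1, 1, 3, 5, 8, 10, 12):
    print(f"{age:6.2f} years: {bathtub_hazard(age):.4f} failures/drive/year")
```

The printed rate starts high, drops to a low flat stretch, then climbs 
again as the wear out term takes over.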

I used to rebuild the old 14 inch drives as part of a past job.  One of 
the few tasks where no one complained of the time you took, because the 
more care you put into cleaning out the debris of the last failure, then 
setting up and calibrating the head positioner etc, the better the 
measured performance afterwards (lower error rates.)  I learnt a lot 
doing that.

I found Gibson's Spinrite almost by accident, before I knew of the 
Security Now series (OK, Windows biased, but still relevant..)  But 
didn't get a copy 'till a friend's wife's PC "died" saying there was no 
OS to boot.  She ran a small business, and had no backups.  (What's new?)  
I said I'd take a look but couldn't promise anything.

After a few evenings messing around with one of the Linux rescue disks, 
I realised the hard drive was more or less OK, but some of the data was 
corrupt, probably part of the boot loader or OS (Win2000 at that time.)

I knew from working on the big drives in the past, you could sometimes 
"tweak" the head position and force it to read data slightly off track, 
sometimes resulting in a good read, if not all the track, then part of 
it.  Do that enough, and with some manual memory editing tools, we 
sometimes stitched a small file back together that way, before scrapping 
the platters and rebuilding the drive.

Spinrite does that automatically, by moving off track to varying degrees, 
then seeking back again.  As all drives start to read before the head 
has fully settled, if you do that timed to the physical disk rotation, 
and do it enough times, you can with averaging and numerical analysis, 
plus the drive's own ECC, rebuild the data sector by sector.
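
To show why "do it enough times" helps, here is a toy model in Python.  It 
is NOT how Spinrite or any real drive works internally (a real drive hands 
back either an ECC corrected sector or an error, not raw noisy bits).  It 
only illustrates the averaging idea: take many unreliable reads of the same 
sector and keep, for each bit, whichever value turned up most often.  The 
function names, the 5% bit error rate and the 31 reads are all assumptions 
for the example.

```python
import random


def noisy_read(true_sector, bit_error_rate, rng):
    """Simulate one marginal read: each bit flips with the given probability."""
    out = bytearray(true_sector)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < bit_error_rate:
                out[i] ^= 1 << bit
    return bytes(out)


def majority_vote(reads):
    """Combine equal-length reads, keeping each bit's most common value."""
    n = len(reads)
    result = bytearray(len(reads[0]))
    for i in range(len(result)):
        for bit in range(8):
            ones = sum((r[i] >> bit) & 1 for r in reads)
            if ones * 2 > n:          # more than half the reads saw a 1
                result[i] |= 1 << bit
    return bytes(result)


rng = random.Random(2011)                       # fixed seed, repeatable run
sector = bytes(rng.randrange(256) for _ in range(512))
reads = [noisy_read(sector, 0.05, rng) for _ in range(31)]
recovered = majority_vote(reads)
print("single read matches:", reads[0] == sector)
print("voted result matches:", recovered == sector)
```

Using an odd number of reads avoids ties in the vote.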

It does work the drives hard though; with some laptops you have to run 
with the drive cover removed and an extra fan to keep it cool!

Anyway, I bought a copy of Spinrite, after much thought, and lots of 
phone calls asking if I'd fixed it yet.   First I tried it on one of my 
own "good" drives, just to see how it worked.  It almost instantly said 
it had found a bad sector on that, much to my surprise, "fixed" it, and 
yes, that old box then booted noticeably faster, as the drive itself 
didn't need to repeatedly read and correct whatever it was to get a good 
copy.  So I tried it on her laptop.

It took a couple of hours.  It reported a cluster of bad sectors early 
on, but managed to recover all the data, so it said.  The rest of the 
drive looked OK.

On taking out the bootable CD, and cycling the power, the machine booted 
like nothing was ever wrong with it!   Just in case, I backed up as much 
as I could to a CD drive, did some updates and the usual routine tidying 
etc, and gave it back to them, with the CD, and instructions to perhaps 
plan on replacing the drive if not the PC sometime (a Toshiba 4600) just 
in case.

Well, I had more free beers for the next two months than I could handle, 
plus a steady stream of similarly sick PCs etc, and that old laptop is 
now mine after they did eventually replace it, and still working 24/7 
(despite other problems, it had Coke spilt into it destroying the battery 
charge and management systems!)  It now runs a software defined radio 
for beacon monitoring.  This in fact...

Spinrite has also won me favors with many other people over the last 
couple of years.  So far, there is only one drive it couldn't work with.  
Oddly, it came from this machine I'm using now, which had no outward 
issues, but after one update didn't boot, as C: had suddenly ceased to 
exist!  A Maxtor 40G IDE drive.

The drive itself fails to initialise and present itself correctly to the 
BIOS, so all on that is lost.  (I did have backups though!)

It's not the electronics card.  I swapped that with the other known good 
one (two identical drives were in this machine), same problem.  I've since 
learnt, though, that most modern hard drives boot their own firmware from 
what is effectively cylinder -1, or off the other end of the list.  That 
is what is corrupt I suspect, but not even Spinrite can get to that to 
help, so I now have a desktop paperweight....

I can well understand why people are sceptical about it, after all, how 
can "Software" fix a "Hardware" problem?  But once it's seen to work 
people then just have to have their own copy.  I think I'm indirectly 
responsible for at least 4 extra sales, not that I get any commission.

Just like the Linux based recovery and self contained AV disks, and also 
Memtest86, I carry a copy of Spinrite around with me too.

I just wish I could come up with something as successful, and able to 
continue selling over and over...

As for changing mobo caps, it's not difficult, but it sure takes a lot of 
time and care.  Caps in PSUs go bad too (usually the Low Voltage ones.)  
Again, not difficult to change, but take care.  There's often considerable 
High Voltage stored in some places, that can bite you, and it hurts!

Lastly, large slow running fans last the longest, and are nice and quiet 
too.  Just regularly blow the "dust bunnies" out of the systems (two or 
three times a year?) and keep things like the CPU cooler and PSU clean, 
and your hardware will work for many years just fine.

Oh..  CPU coolers.  If your system has the ability to monitor the CPU 
temperature, get to know how that behaves depending on the software you 
use.  If it starts to slowly rise, but the room temperature is not 
correspondingly warmer, and cleaning the dust from the cooler doesn't 
seem to help, it may need the cooler removing, the old heat transfer 
compound cleaning off, and fresh compound applying when you refit 
the cooler.   This issue seems worse with the earlier single core P4's, 
that had a very small contact area to the cooler.
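
If you log the readings, a few lines of Python can spot that slow creep.  
This is only a sketch: it assumes you note the CPU and room temperature 
now and then while running the same workload, and the 0.05 C per day limit 
and the sample figures are invented for the example.  It fits a straight 
line to "CPU temperature minus room temperature" and flags a steady rise.

```python
def slope_per_day(days, values):
    """Least-squares slope of values against days."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    den = sum((x - mean_x) ** 2 for x in days)
    return num / den


def cooler_needs_attention(days, cpu_temps_c, room_temps_c,
                           limit_c_per_day=0.05):
    """True if CPU temperature above ambient is creeping up past the limit."""
    # Subtract room temperature so a warm summer is not blamed on the cooler.
    rise_over_ambient = [c - r for c, r in zip(cpu_temps_c, room_temps_c)]
    return slope_per_day(days, rise_over_ambient) > limit_c_per_day


# Hypothetical log: day number, CPU temperature, room temperature (deg C).
days = [0, 30, 60, 90]
cpu = [52, 55, 58, 61]
room = [21, 21, 22, 21]
print("check the cooler:", cooler_needs_attention(days, cpu, room))
```

If the room warms up by the same amount as the CPU, the difference stays 
flat and nothing is flagged.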

At least Intel chips just slow down as they get hotter (cycle skipping) 
so as not to burn out.   Some AMDs will destroy themselves if the cooler 
fails!...    There is a YouTube video somewhere, showing a PC with an 
Intel CPU with no cooler getting slower and slower till it almost stops.

I hope you get things sorted out, one way or another.  Life is so much 
nicer if you don't have to keep messing with the blessed things!

I have a sick Land Rover to fix too.  Gearbox rear oil seal, also rear 
drive shaft UJs.   At least I can use big hammers on that sometimes...   
(Therapy!)   Oh, the grass needs cutting, and I'm now also under 
instruction to change the bed, when the cat's finished sleeping on it!!!

Best Regards.  

Dave B.

On 4 Jun 2011 at 21:35, Kaya Saman wrote:

Subject:	Re: Strange system lockups - kernel saying disk error

> [...] 
>     Hmmm Hard drives do not like heat!   Check the PSU voltages with a
>     meter, for accuracy and ripple.  Failing SMPS's can do all sorts
>     of odd things.
>     Capacitor problems.  Been there done that.  They can be changed
>     for very low cost, other than your time.
>     DaveB
>     You might guess by now, I know far more about hardware than I do
>     about software, but for the latter to run well, the former must be
>     good.
> Many thanks Dave for all the suggestions!!!
> To be honest I think the drives are fine but the system is just soooo
> old including the IDE drives.
> I mean if I get a SATA/IDE USB adapter I should be able to backup the
> drives to the new DAS system I will have in place shortly since I am
> much more in favor of running Nexenta Core 3 OS with ZFS spanning the
> 16x drives meaning a total of 36TB with 2 internal drives used for
> logging and caching.
> Then this system will be obsolete. However, I will keep your
> suggestion of using Spinrite in mind next time I encounter issues!
> BTW I respect your H/W knowledge, that's quite in-depth :-) thank you
> for your insight.
> <just an observation demon.co.uk :-) used to be my old ISP til I went
> with Pipex which is now bust, then I moved out of the UK and now
> everything is roasting hot>
> Best regards,
> Kaya
