[direct] Re: Strange system lockups - kernel saying disk error
dave at g8kbv.demon.co.uk
Sun Jun 5 10:31:15 UTC 2011
Thanks for the reply.
Re the old disk drives. I have several IDE drives here that are 10 years
old or more, from 2G to 40G, in 24/7 use, and (so far!) they are happy. I
also have a few much newer SATA drives that fail to even spin up.
Like anything, there is a "Bath Tub" curve re hardware failure rates.
Lots of infant mortality in a short period, that we rarely see because
they fail at the factory. Then a l o n g time with low rates of
problems, eventually slowly rising as things age, until it's truly cheaper
to replace than repair.
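As a rough illustration of that curve, here is a minimal Python sketch. It
models the failure rate as the sum of a falling "infant mortality" term, a
flat background rate, and a rising wear-out term. The Weibull shapes, the
scales and the 0.02 background rate are made-up numbers chosen only to give
the bath tub shape, not measurements from any real drives.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Toy failure rate (failures per drive-year) at age t_years > 0.

    All parameters below are illustrative assumptions.
    """
    infant = weibull_hazard(t_years, shape=0.5, scale=20.0)    # falls with age
    background = 0.02                                          # flat random failures
    wear_out = weibull_hazard(t_years, shape=5.0, scale=12.0)  # rises with age
    return infant + background + wear_out


# High at the start, low in the middle years, high again at the end.
for age in (0.01, 0.1, 1, 5, 10, 15):
    print(f"{age:>5} years: {bathtub_hazard(age):.3f} failures/drive-year")
```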
I used to rebuild the old 14 inch drives as part of a past job. One of
the few tasks where no one complained of the time you took, as the more
care you put in to cleaning out the debris of the last failure, then
setting up and calibrating the head positioner etc, the bigger the
measurable improvement in performance (lower error rates.) I learnt a lot.
I found Gibson's Spinrite almost by accident, before I knew of the
Security Now series (OK, Windows biased, but still relevant..) But
didn't get a copy 'till a friend's wife's PC "died" saying there was no
OS to boot. She ran a small business, and had no backups. (What's new?)
I said I'd take a look but couldn't promise anything.
After a few evenings messing around with one of the Linux rescue disks,
I realised the hard drive was more or less OK, but some of the data was
corrupt, probably part of the boot loader or OS (Win2000 at that time.)
I knew from working on the big drives in the past, you could sometimes
"tweak" the head position and force it to read data slightly off track,
sometimes resulting in a good read, if not all the track, then part of
it. Do that enough, and with some manual memory editing tools, we
sometimes stitched a small file back together that way, before scrapping
the platters and rebuilding the drive.
Spinrite does that automatically, by moving off track to varying degrees,
then seeking back again, and as all drives start to read before the head
has fully settled, if you do that timed to the physical disk rotation,
and do it enough times, you can with averaging and numerical analysis,
plus the drive's own ECC, rebuild the data sector by sector.
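To show the general idea of combining many imperfect reads, here is a toy
Python sketch that takes several attempts at the same sector and keeps, for
each bit, whichever value turned up most often. This is NOT how Spinrite
itself works internally (I don't know its actual algorithm), and it assumes
you can get at the raw, uncorrected data of each attempt, which a normal
drive interface does not usually hand over. The function name majority_vote
is made up for the example.

```python
def majority_vote(reads):
    """Combine several same-length reads of one sector, bit by bit.

    Each output bit is 1 only if it was 1 in more than half of the reads.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must be the same length")

    out = bytearray(length)
    for i in range(length):
        for bit in range(8):
            ones = sum((r[i] >> bit) & 1 for r in reads)
            if 2 * ones > len(reads):
                out[i] |= 1 << bit
    return bytes(out)


# Three attempts at the same data, each with a different single-bit error.
good = b"BOOT"
reads = [
    bytes([good[0] ^ 0x01, good[1], good[2], good[3]]),
    bytes([good[0], good[1] ^ 0x80, good[2], good[3]]),
    bytes([good[0], good[1], good[2], good[3] ^ 0x10]),
]
print(majority_vote(reads) == good)  # prints True
```

No single read here is correct, but because the errors land in different
places the vote still recovers the original bytes. With an even number of
reads a tied bit comes out as 0, so an odd count is the safer choice.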
It does work the drives hard though, some laptops you have to run with
the drive cover removed and an extra fan to keep it cool!
Anyway, I bought a copy of Spinrite, after much thought, and lots of
phone calls asking if I'd fixed it yet. First I tried it on one of my
own "good" drives, just to see how it worked. It almost instantly said
it found a bad sector on that, much to my surprise, "fixed" it, and yes,
that old box then booted noticeably faster, as the drive itself didn't
need to repeatedly read and correct whatever it was to get a good copy.
So I tried it on her laptop.
It took a couple of hours. It reported a cluster of bad sectors early
on, but managed to recover all the data, so it said. The rest of the
drive looked OK.
On taking out the bootable CD, and cycling the power, the machine booted
like nothing was ever wrong with it! Just in case, I backed up as much
as I could to a CD drive, did some updates and the usual routine tidying
etc, and gave it back to them, with the CD, and instructions to perhaps
plan on replacing the drive if not the PC sometime (a Toshiba 4600) just
Well, I had more free beers for the next two months than I could handle,
plus a steady stream of similarly sick PCs etc, and that old laptop is
now mine after they did eventually replace it, and still working 24/7
(despite other problems, it had Coke spilt into it destroying the battery
charge and management systems!) It now runs a software defined radio
for beacon monitoring. This in fact...
Spinrite has also won me favors with many other people over the last
couple of years. So far, there is only one drive it couldn't work with.
Oddly, it came from this machine I'm using now, which had no outward
issues, but after one update didn't boot, as C: had suddenly ceased to
exist! A Maxtor 40G IDE drive.
The drive itself fails to initialise and present itself correctly to the
BIOS, so all on that is lost. (I did have backups though!)
It's not the electronics card, I swapped that with the other known good
one (two identical drives were in this machine), same problem. I have
since learnt, though, that most modern hard drives boot their own firmware
from what is effectively cylinder -1, or off the other end of the list.
That is what is corrupt I suspect, but not even Spinrite can get to that
to help, so I now have a desktop paperweight....
I can well understand why people are sceptical about it, after all, how
can "Software" fix a "Hardware" problem? But once it's seen to work
people then just have to have their own copy. I think I'm indirectly
responsible for at least 4 extra sales, not that I get any commission,
Just like the Linux based recovery and self contained AV disks, and also
Memtest86, I carry a copy of Spinrite around with me too.
I just wish I could come up with something as successful, and able to
continue selling over and over...
As for changing mobo caps, it's not difficult, but it sure takes a lot of
time and care. Caps in PSUs go bad too (usually the Low Voltage ones),
again, not difficult to change, but take care. There's often considerable
High Voltage stored in some places, that can bite you, and it hurts!
Lastly, large slow running fans last the longest, and are nice and quiet
too. Just regularly blow the "dust bunnies" out of the systems (two or
three times a year?) and keep things like the CPU cooler and PSU clean,
and your hardware will work for many years just fine.
Oh.. CPU coolers. If your system has the ability to monitor the CPU
temperature, get to know how that behaves depending on the software you
use. If it starts to slowly rise, but the room temperature is not
correspondingly warmer, and cleaning the dust from the cooler doesn't
seem to help, it may need the cooler removing, the old heat transfer
compound cleaning off, and fresh compound applying when you refit
the cooler. This issue seems worse with the earlier single core P4s,
that had a very small contact area to the cooler.
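If you log the readings, that check can be roughed out in a few lines of
Python. The sketch below assumes you have noted the CPU and room
temperatures (in degrees C) at regular intervals under a similar load, by
whatever monitoring tool your system offers. It fits a straight line to the
CPU-minus-room difference and flags a steady upward drift. The function
names and the 0.5 degrees per sample threshold are made up for the example,
pick a threshold that suits your own machine.

```python
def slope_per_sample(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def cooler_needs_attention(cpu_temps, room_temps, threshold=0.5):
    """True if the CPU is warming faster than the room explains.

    threshold is in degrees C per sample and is an assumed value.
    """
    if len(cpu_temps) != len(room_temps):
        raise ValueError("need one room reading per CPU reading")
    deltas = [c - r for c, r in zip(cpu_temps, room_temps)]
    return slope_per_sample(deltas) > threshold


# Weekly readings: CPU creeping up while the room stays about the same.
cpu = [52, 53, 55, 56, 58, 60]
room = [21, 21, 22, 21, 22, 21]
print(cooler_needs_attention(cpu, room))  # prints True

# CPU rising only because the room is getting warmer.
print(cooler_needs_attention([52, 54, 56, 58], [21, 23, 25, 27]))  # prints False
```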
At least Intel chips just slow down as they get hotter (cycle skipping)
so as not to burn out. Some AMDs will destroy themselves if the cooler
fails!... There is a YouTube video somewhere, showing a PC with an
Intel CPU with no cooler getting slower and slower till it almost stops.
I hope you get things sorted out, one way or another. Life is so much
nicer if you don't have to keep messing with the blessed things!
I have a sick Land Rover to fix too. Gearbox rear oil seal, also rear
drive shaft UJ's. At least I can use big hammers on that sometimes...
(Therapy!) Oh, the grass needs cutting, and I'm now also under
instruction to change the bed, when the cat's finished sleeping on it!!!
On 4 Jun 2011 at 21:35, Kaya Saman wrote:
Subject: Re: Strange system lockups - kernel saying disk error
> Hmmm Hard drives do not like heat! Check the PSU voltages with a
> meter, for accuracy and ripple. Failing SMPS's can do all sorts
> of odd things.
> Capacitor problems. Been there done that. They can be changed
> for very low cost, other than your time.
> You might guess by now, I know far more about hardware than I do
> about software, but for the latter to run well, the former must be
> Many thanks Dave for all the suggestions!!!
> To be honest I think the drives are fine but the system is just soooo
> old including the IDE drives.
> I mean if I get a SATA/IDE USB adapter I should be able to backup the
> drives to the new DAS system I will have in place shortly since I am
> much more in favor of running Nexenta Core 3 OS with ZFS spanning the
> 16x drives meaning a total of 36TB with 2 internal drives used for
> logging and caching.
> Then this system will be obsolete. However, I will keep your
> suggestion of using Spinrite in mind next time I encounter issues!
> BTW I respect your H/W knowledge that's quite in deep :-) thank you
> for your insight.
> <just an observation demon.co.uk :-) used to be my old ISP til I went
> with Pipex which is now bust, then I moved out of the UK and now
> everything is roasting hot>
> Best regards,