[direct] Re: Strange system lockups - kernel saying disk error

Kaya Saman kayasaman at gmail.com
Sun Jun 5 20:45:26 UTC 2011


On 06/05/2011 01:29 PM, Dave wrote:
> Hi again.
>
> Thanks for the reply.
>
> Re the old disk drives.  I have several 10 year or older IDE drives here,
> from 2G to 40G in 24/7 use, and (so far!) they are happy.  I also have a
> few much newer SATA drives that fail to even spin up.
>
> Like anything, there is a "bathtub" curve for hardware failure rates.
> Lots of infant mortality in a short period, which we rarely see because
> those units fail at the factory.  Then a  l o  n   g    time with low
> rates of problems, eventually slowly rising as things age, until it's
> truly cheaper to replace than repair.
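
The bathtub shape described above can be sketched numerically. This is
only an illustration: summing a falling Weibull term, a constant term and
a rising Weibull term is a common textbook way to draw the curve, and
every parameter value below is made up for the sketch, not measured from
real drives.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape-1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Illustrative failure rate (failures per drive-year) at age t_years.

    All constants are hypothetical, chosen only to give the bathtub shape.
    """
    infant = weibull_hazard(t_years, shape=0.5, scale=20.0)    # falls with age
    random_failures = 0.02                                     # flat background
    wear_out = weibull_hazard(t_years, shape=5.0, scale=12.0)  # rises with age
    return infant + random_failures + wear_out


if __name__ == "__main__":
    for age in (0.05, 0.5, 2, 5, 8, 12, 15):
        print(f"{age:>5} years: {bathtub_hazard(age):.3f} failures/drive-year")
```

Run as a script, it prints a high rate for very young drives, a low
plateau in the middle, and a climbing rate at the old end.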
>
> I used to rebuild the old 14 inch drives as part of a past job.  One of
> the few tasks where no one complained of the time you took, as the more
> care you put in to cleaning out the debris of the last failure, then
> setting up and calibrating the head positioner etc, there was a
> measurable improvement in performance (low error rates.)  I learnt a lot
> doing that.
>
> I found Gibson's Spinrite almost by accident, before I knew of the
> Security Now series (OK, Windows biased, but still relevant..)  But
> didn't get a copy till a friend's wife's PC "died" saying there was no
> OS to boot.  She ran a small business, and had no backups.  (What's new?)
> I said I'd take a look but couldn't promise anything.
>
> After a few evenings messing around with one of the Linux rescue disks,
> I realised the hard drive was more or less OK, but some of the data was
> corrupt, probably part of the boot loader or OS (Win2000 at that time.)
>
> I knew from working on the big drives in the past, you could sometimes
> "tweak" the head position and force it to read data slightly off track,
> sometimes resulting in a good read, if not all the track, then part of
> it.  Do that enough, and with some manual memory editing tools, we
> sometimes stitched a small file back together that way, before scrapping
> the platters and rebuilding the drive.
>
> Spinrite does that automatically, by moving off track to varying degrees,
> then seeking back again, and as all drives start to read before the head
> has fully settled, if you do that timed to the physical disk rotation,
> and do it enough times, you can with averaging and numerical analysis,
> plus the drive's own ECC, rebuild the data sector by sector.
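
The "read it many times and average" idea can be illustrated with a toy
simulation. To be clear, this is not Spinrite's actual algorithm, which
is not public in this thread; it is a minimal sketch of bitwise majority
voting over repeated noisy reads, and the function names and the error
rate are assumptions made up for the example.

```python
import random


def noisy_read(true_sector, bit_error_rate, rng):
    """Return a copy of true_sector with each bit flipped independently."""
    out = bytearray(true_sector)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < bit_error_rate:
                out[i] ^= 1 << bit
    return bytes(out)


def majority_vote(reads):
    """Rebuild a sector by taking, for every bit, the value seen most often."""
    length = len(reads[0])
    result = bytearray(length)
    for i in range(length):
        for bit in range(8):
            ones = sum((r[i] >> bit) & 1 for r in reads)
            if ones * 2 > len(reads):
                result[i] |= 1 << bit
    return bytes(result)


if __name__ == "__main__":
    rng = random.Random(2011)
    truth = bytes(range(64))  # stand-in for real sector contents
    reads = [noisy_read(truth, 0.05, rng) for _ in range(51)]
    print("single read matches:", reads[0] == truth)
    print("voted result matches:", majority_vote(reads) == truth)
```

An odd number of reads avoids ties. A real drive adds ECC on top, so far
fewer passes may be needed than this naive vote suggests.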
>
> It does work the drives hard though; some laptops you have to run with
> the drive cover removed and an extra fan to keep it cool!
>
> Anyway, I bought a copy of Spinrite, after much thought, and lots of
> phone calls asking if I'd fixed it yet.  I first tried it on one of my
> own "good" drives, just to see how it worked.  It almost instantly said
> it found a bad sector on that, much to my surprise, "fixed" it, and yes,
> that old box then booted noticeably faster, as the drive itself didn't
> need to repeatedly read and correct whatever it was to get a good copy.
> So I tried it on her laptop.
>
> It took a couple of hours.  It reported a cluster of bad sectors early
> on, but managed to recover all the data, so it said.  The rest of the
> drive looked OK.
>
> On taking out the bootable CD, and cycling the power, the machine booted
> like nothing was ever wrong with it!   Just in case, I backed up as much
> as I could to a CD drive, did some updates and the usual routine tidying
> etc, and gave it back to them, with the CD, and instructions to perhaps
> plan on replacing the drive if not the PC sometime (a Toshiba 4600) just
> in case.
>
> Well, I had more free beers for the next two months than I could handle,
> plus a steady stream of similarly sick PCs etc, and that old laptop is
> now mine after they did eventually replace it, and still working 24/7
> (despite other problems: it had Coke spilt into it, destroying the
> battery charge and management systems!)  It now runs a software defined
> radio for beacon monitoring.  This in fact...
> http://g8kbv.homeip.net:8008/60m/ral-at-wbx.html
>
> Spinrite has also won me favors with many other people over the last
> couple of years.  So far, there is only one drive it couldn't work with.
> Oddly, it came from the machine I'm using now, which had no outward
> issues, but after one update didn't boot, as C: had suddenly ceased to
> exist!  A Maxtor 40G IDE drive.
>
> The drive itself fails to initialise and present itself correctly to the
> BIOS, so all on that is lost.  (I did have backups though!)
>
> It's not the electronics card; I swapped that with the other known good
> one (two identical drives were in this machine), same problem.  I have
> since learnt, though, that most modern hard drives boot their own
> firmware from what is effectively cylinder -1, or off the other end of
> the list.  That is what is corrupt, I suspect, but not even Spinrite can
> get to that to help, so I now have a desktop paperweight....
>
> I can well understand why people are sceptical about it, after all, how
> can "Software" fix a "Hardware" problem?  But once it's seen to work,
> people then just have to have their own copy.  I think I'm indirectly
> responsible for at least 4 extra sales, not that I get any commission,
> sadly...
>
> Just like the Linux based recovery and self contained AV disks, and also
> Memtest86, I carry a copy of Spinrite around with me too.
>
> I just wish I could come up with something as successful, and able to
> continue selling over and over...
>
> As for changing mobo caps, it's not difficult, but it sure takes a lot of
> time and care.  Caps in PSUs go bad too (usually the low voltage ones);
> again, not difficult to change, but take care.  There's often considerable
> high voltage stored in some places, that can bite you, and it hurts!
>
> Lastly, large slow running fans last the longest, and are nice and quiet
> too.  Just regularly blow the "dust bunnies" out of the systems (two or
> three times a year?) and keep things like the CPU cooler and PSU clean,
> and your hardware will work for many years just fine.
>
> Oh..  CPU coolers.  If your system has the ability to monitor the CPU
> temperature, get to know how that behaves depending on the software you
> use.  If it starts to slowly rise, but the room temperature is not
> correspondingly warmer, and cleaning the dust from the cooler doesn't
> seem to help, it may need the cooler removing, the old heat transfer
> compound removing and cleaning, and fresh compound applying when you
> refit the cooler.  This issue seems worse with the earlier single core
> P4s, that had a very small contact area to the cooler.
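
The check described above (CPU temperature creeping up while the room
does not) can be sketched as a trend test on logged readings. This is a
rough sketch under assumptions: the readings are taken to be
(day, cpu_temp_c, room_temp_c) tuples collected by whatever monitoring
the system offers, under a similar load each time, and the function
names and the 0.05 C/day threshold are hypothetical.

```python
def slope_per_day(samples):
    """Least-squares slope of value over day for (day, value) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx  # needs at least two distinct days


def cooler_needs_attention(readings, threshold_c_per_day=0.05):
    """True if CPU-minus-room temperature is trending upward.

    Subtracting room temperature removes the case where the CPU is
    hotter only because the room is. The threshold is an arbitrary
    illustrative value, not a vendor figure.
    """
    deltas = [(day, cpu - room) for day, cpu, room in readings]
    return slope_per_day(deltas) > threshold_c_per_day


if __name__ == "__main__":
    drifting = [(d, 50 + 0.1 * d, 22.0) for d in range(30)]
    summer = [(d, 50 + 0.1 * d, 22 + 0.1 * d) for d in range(30)]
    print("drifting:", cooler_needs_attention(drifting))
    print("room warming too:", cooler_needs_attention(summer))
```

If the flag still trips after the dust has been blown out, that points
towards the heat transfer compound rather than airflow.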
>
> At least Intel chips just slow down as they get hotter (cycle skipping)
> so as not to burn out.   Some AMDs will destroy themselves if the cooler
> fails!...    There is a YouTube video somewhere, showing a PC with an
> Intel CPU with no cooler getting slower and slower till it almost stops.
>
> I hope you get things sorted out, one way or another.  Life is so much
> nicer if you don't have to keep messing with the blessed things!
>
> I have a sick Land Rover to fix too.  Gearbox rear oil seal, also rear
> drive shaft UJ's.   At least I can use big hammers on that sometimes...
> (Therapy!)   Oh, the grass needs cutting, and I'm now also under
> instruction to change the bed, when the cat's finished sleeping on it!!!
>
> Best Regards.
>
> Dave B.
>
>
> On 4 Jun 2011 at 21:35, Kaya Saman wrote:
>
> Subject:	Re: Strange system lockups - kernel saying disk error
>
>    
>> [...]
>>
>>
>>
>>      Hmmm Hard drives do not like heat!   Check the PSU voltages with a
>>      meter, for accuracy and ripple.  Failing SMPS's can do all sorts
>>      of odd things.
>>
>>      Capacitor problems.  Been there done that.  They can be changed
>>      for very low cost, other than your time.
>>
>>      DaveB
>>
>>      You might guess by now, I know far more about hardware than I do
>>      about software, but for the latter to run well, the former must be
>>      good.
>>
>>      _______________________________________________
>>      freebsd-questions at freebsd.org mailing list
>>      http://lists.freebsd.org/mailman/listinfo/freebsd-questions
>>      To unsubscribe, send any mail to
>>      "freebsd-questions-unsubscribe at freebsd.org"
>>
>>
>> Many thanks Dave for all the suggestions!!!
>>
>> To be honest I think the drives are fine but the system is just soooo
>> old including the IDE drives.
>>
>> I mean if I get a SATA/IDE USB adapter I should be able to backup the
>> drives to the new DAS system I will have in place shortly since I am
>> much more in favor of running Nexenta Core 3 OS with ZFS spanning the
>> 16x drives meaning a total of 36TB with 2 internal drives used for
>> logging and caching.
>>
>> Then this system will be obsolete. However, I will keep your
>> suggestion of using Spinrite in mind next time I encounter issues!
>>
>> BTW I respect your H/W knowledge, which is quite in-depth :-) thank you
>> for your insight.
>>
>> <just an observation demon.co.uk :-) used to be my old ISP til I went
>> with Pipex which is now bust, then I moved out of the UK and now
>> everything is roasting hot>
>>
>>
>> Best regards,
>>
>>
>> Kaya
>>
>

Thanks Dave for this very graphic and insightful story :-)

It was a pleasure to read and a nice display of how experience really 
does prevail over things!!!


I liked the radio chart on the site provided :-) - what exactly is it 
measuring? Background noise?


I think not having a UPS for over a year killed me, with the power
cutting out almost every weekend for 10 - 20 minutes/night. Now I have
UPSes, 2x 1500VA APC systems... nice, but they need the network and temp
monitoring cards. Need plenty of £££ for that! Plus the new server I am
intending to build as the DAS box already cost $2000.


Regards,


Kaya

