How to recover data from a dead hard drive.

Valeri Galtsev galtsev at kicp.uchicago.edu
Thu Oct 19 15:02:09 UTC 2017


On Thu, October 19, 2017 3:47 am, Frank Leonhardt (m) wrote:
>
>
> On 14 October 2017 11:14:26 BST, Carmel NY <carmel_ny at outlook.com> wrote:
>>On Fri, 13 Oct 2017 12:03:11 +0100, Frank Leonhardt (m) stated:
>>
>>>Good list that. ddrescue is, IME, the place to start.
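[For the record, a typical ddrescue run looks something like the rough
sketch below; device names are examples only, and the image should always
go to a separate, healthy disk, never back onto the failing one:

  # First pass: copy everything that reads easily; -n skips the slow
  # scraping phase so bad areas are left for later.
  ddrescue -d -n /dev/ada1 /backup/ada1.img /backup/ada1.map

  # Second pass: come back for the bad areas and retry them a few times.
  # The map file lets ddrescue resume instead of starting from scratch.
  ddrescue -d -r3 /dev/ada1 /backup/ada1.img /backup/ada1.map

Then run fsck or file-carving tools against the image, not against the
dying drive.]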
>
>>>I also have some expensive licensed software for recovering incomplete
>>>file systems, and charge Windows users like a wounded rhinoceros when
>>>they need their data back.
>>
>>Years ago, I used "Spinrite" to recover a damaged drive. It worked better
>>than anything else on the market, at least for me, at the time. I don't
>>think it has been maintained in over a decade though.
>>
>>Personally, I fail to understand why anyone with any "mission critical"
>>system would not be using some form of RAID. It doesn't make any sense to
>>me. Even my laptop is configured to automatically back up data to a cloud
>>service. Even if the drive went south, I could restore all of my data.
>
> I can explain why people aren't using RAID... IME It's because they think
> they are. But they do it wrong, and only find out when things go wrong.
>
> Most of the disasters I deal with involve "hardware" RAID cards. I won't
> single out PERC or MegaRAID because that wouldn't be fair.

Hm... My mileage is different. I use hardware RAIDs a lot, with great
success, and not a single disaster has happened to me. The statistics for
my case: between one and two dozen hardware RAIDs over at least a decade
and a half. Some that are still in production are over 10 years old. My
favorite, 3ware, alas, was eradicated by competitors; my second favorite is
Areca, and next would be LSI, which is not a favorite as it has a horrible
(confusing!) command-line client interface.

Sometimes people come from different places and tell "hardware RAID
horror" stories. After a detailed review, all of them boil down to one or
more of the following:

1. The RAID was not set up correctly. Namely: no surface scan (scrub, or
similar) was scheduled. Monthly would be enough; I usually schedule it
weekly (a cron sketch covering this and point 2 follows the list). I will
not go into detail about how this leads to problems; it has been described
many times;

2. notification to the sysadmin about a failed drive or lost RAID
redundancy was not arranged (which is likewise an incorrectly configured
RAID);

3. inappropriate drives were used. The worst for RAID are "green" drives
that spin down to conserve power: they do spin up when a request from the
RAID card comes, but the spin-up takes long enough that the controller
times them out...

4. the write cache was enabled without a battery backup unit that preserves
the cache RAM, with all its data, through a power outage.

...
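For points 1 and 2 this is normally just a cron job. Here is a minimal
sketch under the assumption that your controller's CLI can start a verify
and report unit status; the names used here (raid-cli, start-verify,
show-status) are placeholders, so substitute the real invocations for your
card (tw_cli, Areca's CLI, storcli, etc.):

  #!/bin/sh
  # raid-check.sh -- hypothetical weekly job: start a surface scan
  # (verify/scrub) and mail the sysadmin if the array is not healthy.

  ADMIN="root"                           # who gets the alert
  RAID_CLI="/usr/local/sbin/raid-cli"    # placeholder: your vendor's CLI

  # 1. Kick off the verify/scrub pass (subcommand is vendor-specific).
  $RAID_CLI start-verify

  # 2. Grab the unit status and alert on anything that is not optimal.
  STATUS=$($RAID_CLI show-status)
  echo "$STATUS" | grep -Eqi 'degraded|rebuild|fail' && \
      echo "$STATUS" | mail -s "RAID needs attention on $(hostname)" "$ADMIN"

Scheduled from /etc/crontab, e.g. weekly on Sunday at 03:00:

  # minute hour mday month wday who command
  0 3 * * 0 root /usr/local/sbin/raid-check.sh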

I've heard people bash hardware RAID in favor of software RAID many times,
and everyone in favor of software RAID ignored the following. Software RAID
runs as process(es) under the main system. Hardware RAID runs under a very
slim system on its own dedicated hardware (the CPU and RAM inside the RAID
card). The difference is that the second is far more robust; being really
small code, it is much less likely to have bugs. If the [main] system locks
up, software RAID never finishes its function, and whereas there are
mechanisms for a filesystem to recover from such a situation, (software)
RAID does not have good mechanisms for getting out of it with minimal
losses (someone correct me if I'm wrong here). Hardware RAID will continue
and finish its tasks even if the [main] system locked up with a kernel
panic: it is independent of the system that runs on the machine. Similarly
with a sudden power loss: if power is restored within a reasonable time,
hardware RAID will lose much less (if anything) than software RAID.
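To be fair to software RAID, at least its state is easy to inspect after an
unclean shutdown. A quick check on FreeBSD, assuming a gmirror volume,
looks something like this (the output shown is illustrative only):

  # Show the state of a gmirror software mirror after the reboot.
  # COMPLETE means the components are in sync; DEGRADED means a component
  # was dropped and the mirror needs to resynchronize or be repaired.
  gmirror status

  # Illustrative output only:
  #       Name    Status  Components
  # mirror/gm0  DEGRADED  ada0 (ACTIVE)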

All in all, hardware RAID is one of the well-known general ways of
increasing the reliability of a sophisticated system: instead of designing
one big sophisticated system, one splits it into independent, smaller, and
simpler functioning units, each of which keeps doing its task even if some
other unit fails (the next step would be redundancy, which I am leaving out
as it does not relate to the software/hardware RAID comparison).

Just my $0.02

Valeri

>
> People are using stupid operating systems (i.e. not FreeBSD or Solaris)
> and the HBA acts as a volume manager. The OS is clueless when a drive goes
> flaky as all it sees is one big drive, and the first they know of a problem
> is when the final drive croaks. (Who is there to see the identity lamp
> flashing red?)
>
> Companies pay lots of money for this kit (to run Windows) and believe what
> they were told when they bought it. The more the kit costs, the more lusers
> trust it.
>
> </rant>
>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++

