raid1 + degraded (take out one disk) + fatal trap 12 on next reboot

Ted Mittelstaedt tedm at toybox.placo.com
Sat May 3 16:10:00 UTC 2008


Roberto,

  You can't simulate a disk drive failure that way.  If
you really want to know what's going on, the issue is that
you're pointing the swap at ar0.  If you must get this
booted again, you can try booting into single-user mode,
editing /etc/fstab to point the partitions at /dev/ad0
instead of /dev/ar0, and booting.  But this is an
emergency action and is not recommended.
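
  As a rough sketch of that fstab edit (not a tested recovery
procedure): the function below rewrites /dev/ar0 entries to another
device name and is shown against a scratch file with a made-up
partition layout.  The name retarget_fstab and the sample entries are
assumptions for illustration, and ad0 may well be ad4 or ad6 on your
controller.

```shell
#!/bin/sh
# Sketch only: rewrite ar0 device names in an fstab-style file.
# retarget_fstab is a hypothetical helper, not a FreeBSD tool.

# Print the contents of file $1 with a leading /dev/ar0 replaced by /dev/$2.
retarget_fstab() {
    sed "s|^/dev/ar0|/dev/$2|" "$1"
}

# Demonstrate on a scratch file (hypothetical layout), not the live /etc/fstab.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Device        Mountpoint  FStype  Options  Dump  Pass#
/dev/ar0s1b     none        swap    sw       0     0
/dev/ar0s1a     /           ufs     rw       1     1
/dev/ar0s1d     /usr        ufs     rw       2     2
EOF

# Show what the edited file would contain.
retarget_fstab "$sample" ad0
```

  On the real machine the root filesystem is normally mounted
read-only in single-user mode, so it has to be remounted read-write
(mount -u -w / or similar) before /etc/fstab can be changed.  Keep a
copy of the original file so you can put it back afterwards.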

  If you want to simulate a drive failure, WHILE THE
SYSTEM IS RUNNING pull the SATA connector on one drive.

  The system should NOT trap; it should simply print an
error to the console and show that it has gone into degraded mode.

  If you then reboot, the system may or may not come back up.

  You have to understand the approach of RAID mirroring.  Basically
this is poor-man's data protection.  The idea is that a disk
usually fails in the middle of the day during the worst possible
time.  When it does you do NOT want the server to stop or
crash.  You want it to keep running until the evening when you
can spend a couple hours getting the disk replaced.  (or until
the next day when you can buy a replacement drive)

  When you have the replacement disk ready to plug into the
system, you are supposed to run a full backup of your data
on the degraded array just in case the reinsertion goes badly.

  I have found the safest approach is to leave the server alone
and get the replacement disk ready.  Wiping it in another system
with dd if=/dev/zero of=/dev/ad1 bs=50k before reinsertion is
the best policy.
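
  If you want to rehearse the wipe before pointing dd at a real
device, something like the following runs the same command against a
scratch file and checks that only zeros are left.  The scratch file
and the count=4 are assumptions for the demonstration; on an actual
disk you would leave count off, and /dev/ad1 must be the NEW drive,
because dd will not ask before destroying whatever it is pointed at.

```shell
#!/bin/sh
# Rehearsal of the wipe on a scratch file standing in for the disk.
img=$(mktemp)

# Fill the stand-in "disk" with 4 x 50k of non-zero data.
dd if=/dev/urandom of="$img" bs=50k count=4 2>/dev/null

# The wipe itself.  count= is needed only because a plain file has no
# fixed end; against a real device dd stops at the end of the disk.
dd if=/dev/zero of="$img" bs=50k count=4 conv=notrunc 2>/dev/null

# Verify: delete every NUL byte and count what is left over.
leftover=$(tr -d '\000' < "$img" | wc -c)
size=$(wc -c < "$img")
echo "size=$size leftover=$leftover"
```

  A leftover count of zero means every byte in the file was
overwritten.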

  Follow the steps in the man page for reinsertion.  Keep in
mind that they don't always work.  If they don't then you will
have to wipe both disks and regenerate the array and reinstall
the OS.  That is why you make a backup first when the system is
off-duty.
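
  For orientation, here is a dry-run sketch of the kind of sequence
the atacontrol(8) man page describes for putting a replaced disk back
into an ata RAID1.  It only prints the commands unless DRY_RUN=0 is
set.  The channel ata3 and disk ad6 are made-up examples, and the
run and reinsert_plan helpers are hypothetical; check the man page
for your release before running any of this for real.

```shell
#!/bin/sh
# Dry-run sketch of re-adding a replaced disk with atacontrol(8).
# ARRAY, CHANNEL and DISK are hypothetical; check "atacontrol list"
# and "atacontrol status ar0" on the real machine first.
ARRAY=ar0
CHANNEL=ata3
DISK=ad6
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reinsert_plan() {
    run atacontrol status "$ARRAY"            # confirm the array is degraded
    run atacontrol detach "$CHANNEL"          # release the channel of the dead disk
    if [ "$DRY_RUN" -ne 1 ]; then
        # Only pause when commands are really being executed.
        printf 'Swap the disk now, then press Enter: '
        read _
    fi
    run atacontrol attach "$CHANNEL"          # probe the replacement disk
    run atacontrol addspare "$ARRAY" "$DISK"  # hand the new disk to the array
    run atacontrol rebuild "$ARRAY"           # start the mirror rebuild
    run atacontrol status "$ARRAY"            # check progress
}

reinsert_plan
```

  Printing the plan first makes it easy to compare the device names
against what the system actually reports before anything is touched.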

Ted

> -----Original Message-----
> From: owner-freebsd-questions at freebsd.org
> [mailto:owner-freebsd-questions at freebsd.org]On Behalf Of Roberto Nunnari
> Sent: Tuesday, April 29, 2008 12:35 PM
> To: freebsd-questions at freebsd.org
> Subject: Re: raid1 + degraded (take out one disk) + fatal trap 12 on
> next reboot
> 
> 
> Nobody on this, please? :)
> 
> 
> Roberto Nunnari wrote:
> > Hi all!
> > 
> > I'm playing with new HW and FreeBSD 6.3 and 7.0.
> > 
> > I set up raid 1 on two sata disks (fakeraid on ICH9R)
> > and as far as I can see, it seems to work very well.
> > 
> > Now I'm trying to simulate 1 disk failure (I just take
> > out a disk and boot again). It doesn't matter which of the
> > two disks I take out: the bios correctly shows the raid
> > as degraded and bootable and loads the FreeBSD loader, which
> > loads the kernel and starts the boot.
> > But when the kernel comes to the drives (or the swap?)
> > it fatal traps 12. The trap description says that
> > the current process is 0 (swapper).
> > 
> > Reading that I commented out the swap partition from fstab,
> > but that doesn't help.
> > 
> > How can I get the system to finish the boot?
> > 
> > Thank you and best regards.
> > 

