big vinum problem

Greg 'groggy' Lehey grog at FreeBSD.org
Tue Oct 14 00:02:41 PDT 2003


On Monday, 13 October 2003 at 23:46:12 -0700, Octavian Hornoiu wrote:
> After a power loss last night I restarted my server, which has a
> RAID-5 array of 425 GB or so, and expected to go through a lengthy fsck
> after which the system would come up.  However, one of the vinum
> subdisks was down.  So I rebooted into single user mode, restarted the
> home.p0.s3 subdisk and then ran a manual fsck.  What followed was a
> series of hard errors that said:
>
> ad7s1e: hard error reading fsbn 86482817 of 43241337-43241448 (ad7s1 bn
> 86482817; cn 5383 tn 78 sn 8) status=59 error=40
> vinum: home.p0.s3 is crashed by force
> vinum: home.p0 is degraded
>  fatal: home.p0.s3 read error, block 43241337 for 57344 bytes
> home.p0.s3 user buffer block 30268632 for 57344 bytes
> ** Phase 2 - Check Pathnames
> ad11s1e hard error etc
>
> Then home.p0.s7 becomes corrupt and crashes by force, and I find
> myself staring at a screen that says:
>
> CANNOT READ: BLK 297103054
> CONTINUE [yn]
>
>
> I have done this twice now.  Every time, vinum successfully initializes
> the subdisk and the plex comes up in the "up" state, but once I run
> fsck it crashes again.  What exactly can I do to remedy this?  If
> it's a bad disk I'll replace it, but can't vinum work around bad blocks?

That depends on your configuration, which you haven't described.  Take
a look at http://www.vinumvm.org/vinum/how-to-debug.html.

> My system is FreeBSD 4.9 RC from the RELEASE branch with all the latest
> patches; I'm fully up to date.  I have 8 subdisks in vinum,
> home.p0.s0 through home.p0.s7, with a 55 GB partition on each drive
> used by vinum.  All the drives are identical and all they contain is
> the vinum partitions.

It looks as if you have only one plex, then.  Vinum doesn't normally
recover from these problems.  It follows a slightly different policy
from UFS: if there are bad sectors on a subdisk, it doesn't trust the
entire subdisk.  There are ways around this, but they haven't been
committed.  Send me the information asked for on the web page and I'll
send you instructions on how to fix the problem.

This still means that you'll probably have to change the disk.
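
To illustrate why the second failing subdisk (home.p0.s7 in the report
above) is the fatal one for a single-plex RAID-5 volume, here is a minimal
sketch of generic single-parity reconstruction.  It is not Vinum's code:
the function names are hypothetical, parity is kept in the last block of
the stripe rather than rotated, and a subdisk that is down is treated as
wholly unavailable, in line with the policy described above.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def read_stripe(stripe, down):
    """Return the data blocks of a stripe, given the set of subdisk indices that are down.

    One missing block can be rebuilt by XORing all surviving blocks.
    With two or more missing blocks there is not enough information left.
    """
    if len(down) > 1:
        raise IOError("more than one subdisk down: stripe is unrecoverable")
    result = list(stripe)
    for i in down:
        # Rebuild the missing block from the survivors only.
        survivors = [block for j, block in enumerate(stripe) if j != i]
        result[i] = xor_blocks(survivors)
    return result[:-1]  # drop the parity block


if __name__ == "__main__":
    stripe = make_stripe([b"abcd", b"efgh", b"ijkl"])
    print(read_stripe(stripe, {1}))  # one subdisk down: data is rebuilt
    try:
        read_stripe(stripe, {1, 2})  # two subdisks down
    except IOError as err:
        print("read failed:", err)
```

With one subdisk down the plex can still serve every block (the
"degraded" state in the log above); once a second subdisk goes down, reads
of the affected stripes cannot be satisfied at all.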

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers.