Vinum, replaced disk -- fsck error.

Greg 'groggy' Lehey grog at FreeBSD.org
Fri Mar 19 14:27:44 PST 2004


On Friday, 19 March 2004 at  3:03:34 +0000, Lewis Thompson wrote:
> On Thu, Mar 18, 2004 at 01:26:02PM +1030, Greg 'groggy' Lehey wrote:
>> On Tuesday, 16 March 2004 at 17:25:26 +0000, Lewis Thompson wrote:
>>> I can't think of anything else.  Originally I ran dd without the
>>> conv=noerror and it stopped at around 25GB (the disk is a 100GB).  The
>>> destination disk is 123GB but to my knowledge that is acceptable for dd.
>>>
>>>   During the process a number (maybe eight to ten) of I/O errors were
>>> reported.
>>
>> But not to me.
>
> I've included more detailed errors nearer to the end of this email :)
>
>> I was really thinking of "What to do if you have problems with Vinum"
>> at http://www.vinumvm.org/vinum/how-to-debug.html.
>
> Okay, I did actually do my best to follow this but maybe got
> sidetracked.  I'm just going to bullet point these now so I don't miss
> any of them out.
>
> * Problems: ``dd'' cloned disk ``does not work'' (i.e. gstat shows no
>   activity on the cloned disk during reading of files).

What command did you enter?  What happened on the command line?

> Also see previous emails.

You can't really expect me to go digging for them.

> * Changes to system: Originally vinum ran on 4.9-STABLE.  This worked
>   but had periodic ``disk crashes'' (i.e. vinum states disk as offline).
>   I don't think this is the problem as the same behaviour happens with
>   5.2.1-p1 using the original dodgy disk (only GEOM removes it instead
>   of vinum).

It looks like part of the problem to me.  It seems that you have a
flaky disk.  Is that correct?

> * Vinum list (excuse lack of wrapping).

On the contrary, it shouldn't be wrapped.

I don't see anything in this list that hasn't started.  Is it correct
that volume "data" only has one plex?

> * /var/log/messages extract.  I originally started vinum a long while
>   before, I included this entry too (excuse wrapping):
>
> Mar 17 23:33:57 amnesia kernel: vinum: loaded
> Mar 17 23:34:00 amnesia kernel: vinum: reading configuration from /dev/ad1s1h
> Mar 17 23:34:00 amnesia kernel: vinum: updating configuration from /dev/ad2s1h
> Mar 17 23:34:00 amnesia kernel: vinum: updating configuration from /dev/ad3s1h
> Mar 19 02:49:26 amnesia kernel: WARNING: /mnt/data was not properly dismounted
> Mar 19 02:52:15 amnesia kernel: vinum: null rqg
>
>   This seems a little odd to me -- previously I had not had a null rqg
> error.

This is certainly an interesting one.

> I think maybe I didn't test it enough.  Since these are mostly avi
> files I can tell if they are broken or not by seeing if they have an
> index -- last time they all played but many without indexes.
> Nothing has changed since then; maybe I wasn't being thorough
> enough?

I'm wondering if the problem isn't at least partially due to the flaky
disk.  The "null rqg" message indicates that a request couldn't be
mapped.  I'd really need a dump from this point, if the problem is
repeatable.  Let me know and I'll send you a patch.

>>>   During the process a number (maybe eight to ten) of I/O errors were
>>> reported.
>
> These were dd errors.  I didn't write these down at the time (silly of
> me) and I'm not sure they even go into any log files.  However, I have
> found the exact error messages I got (although the offsets are wrong).
> If required I will re-run dd and provide the full errors.
>
>   The messages were:
>
> dd: reading `/dev/ad3': Input/output error

Hmm.  Why are you running against /dev/ad3?  Why are you using dd at
all?  In any case, I would expect error messages in /var/log/messages
at this point.
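
One way to look for those kernel messages is to filter the log for the
device name.  This is a minimal sketch, assuming the default log location
/var/log/messages and the device name ad3 from the dd output quoted above;
the helper name disk_msgs is made up for illustration.

```shell
# disk_msgs DEVICE [LOGFILE]
# Print kernel log lines that mention DEVICE.
# LOGFILE defaults to /var/log/messages (an assumption about this system).
disk_msgs() {
    grep "kernel:.*$1" "${2:-/var/log/messages}"
}

# Example: disk_msgs ad3
```

Rotated logs would need to be searched separately, since only the
current file is read here.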

> In a reply to my original question you stated that ``dd if=ad3 of=ad1
> bs=8192 conv=noerror'' ``may or may not work, depending on details you
> haven't reported.''  Do these detailed errors help at all?

A little.  They tell me that the drive is flaky.  I'd expect to see
the error messages in /var/log/messages, though.

> I just read a thread[1] about dd that makes me wonder whether it
> would have been.

The only reference I see there is to the current thread.  At least it
gave me the background.

>   I think that's everything.  I'm just going to include some other stuff
> from earlier emails that has been chopped earlier.  Maybe it has some
> relevance:
>
> = fsck_ufs /dev/vinum/data gives the following message:
> = ** /dev/vinum/data
> = cannot alloc 4316869296 bytes for inphead

Yes, this is almost certainly due to incorrect copying.  Probably the
conv=noerror is to blame for that.
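
To illustrate why conv=noerror on its own can cause this kind of damage:
without sync, dd omits an unreadable input block from the output, so
everything after it lands at a lower offset than on the source.  With
conv=noerror,sync the bad block is written as NULs and the offsets are
preserved.  The following is only a simulation on a scratch file.  No
real read error occurs; block 1 of a four-block file is simply treated
as unreadable, and the file names are invented.

```shell
#!/bin/sh
# Simulation only: block 1 (BBBB) is treated as if it were unreadable.
bs=4
work=$(mktemp -d)
printf 'AAAABBBBCCCCDDDD' > "$work/src"

# Layout after conv=noerror alone: the failed block is omitted,
# so all later blocks move up by one.
{ dd if="$work/src" bs=$bs count=1 2>/dev/null
  dd if="$work/src" bs=$bs skip=2 2>/dev/null; } > "$work/omitted"

# Layout after conv=noerror,sync: the failed block becomes NULs,
# so later blocks stay at their original offsets.
{ dd if="$work/src" bs=$bs count=1 2>/dev/null
  dd if=/dev/zero bs=$bs count=1 2>/dev/null
  dd if="$work/src" bs=$bs skip=2 2>/dev/null; } > "$work/padded"

# Read block 2 (CCCC on the source) back from each copy.
omitted_block2=$(dd if="$work/omitted" bs=$bs skip=2 count=1 2>/dev/null)
padded_block2=$(dd if="$work/padded" bs=$bs skip=2 count=1 2>/dev/null)
echo "omitted: $omitted_block2  padded: $padded_block2"

rm -rf "$work"
```

In the omitted layout, block 2 no longer holds the data the source had
there, which is how file system metadata can end up in the wrong place
on the copy.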

I suspect that, unless you can read the sections of the volume that
appear to be causing the fsck problems, you may be out of luck.  About
the only thing you could try is to mount the volume read-only without
fsck, and then copy what data you can elsewhere.
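
A rough sketch of that last-resort approach, assuming the volume and
mount point that appear earlier in this thread (/dev/vinum/data and
/mnt/data) and a hypothetical destination directory /rescue on a healthy
disk.  The function name is invented, and it only prints the commands so
they can be reviewed before anything is run.

```shell
# rescue_plan VOLUME MOUNTPOINT DEST
# Print (do not execute) the commands for a read-only salvage copy.
# DEST is a hypothetical directory on a known-good disk.
rescue_plan() {
    vol=$1; mnt=$2; dest=$3
    # Mount read-only so nothing is written to the damaged file system.
    echo "mount -o ro $vol $mnt"
    # Copy recursively, preserving modes and times where possible.
    echo "cp -Rp $mnt/. $dest/"
    echo "umount $mnt"
}

rescue_plan /dev/vinum/data /mnt/data /rescue
```

Files that sit on damaged areas may still fail to copy; the point is
only to salvage whatever remains readable.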

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
Note: I discard all HTML mail unseen.
Finger grog at FreeBSD.org for PGP public key.
See complete headers for address and phone numbers.