svn commit: r219667 - head/usr.sbin/bsdinstall/partedit

Doug Barton dougb at FreeBSD.org
Mon Mar 28 23:22:55 UTC 2011


On 03/21/2011 00:33, Jeff Roberson wrote:
> On Sun, 20 Mar 2011, Doug Barton wrote:
>
>> On 03/20/2011 09:22, Marius Strobl wrote:
>>
>>> I fear it's still a bit premature to enable SU+J by default. Rather
>>> recently I was told about an SU+J filesystem lost after a panic
>>> that happened after snapshotting it (report CC'ed, maybe he can
>>> provide some more details) and I'm pretty sure I've seen the problem
>>> described in PR 149022 also after the potential fix mentioned in its
>>> feedback.
>>
>> +1
>>
>> I tried enabling SU+J on my /var (after backing up of course) and
>> after a panic random files were missing entirely. Not the last updates
>> to those files, the whole file, and many of them had not been written
>> to in days/weeks/months.
>>
>
> So you're saying the directory entry was missing?

I'm saying that the file wasn't visible to 'ls /var/db/pkg/foo/'. I 
didn't debug it past determining that the files were missing.

> Can you tell me how big the directory was?

Most of the damage was in /var/db/pkg/, so the individual directories 
that were missing files were small, no more than 10 files each. I 
imagine there was probably other damage scattered throughout /var, but 
once I learned how many files were missing I just nuked it and restored 
from backup.

> Number of files?

I stopped counting around 20 or so.

> Approximate directory size when
> you consider file names? When you fsck'd were inodes recovered and
> linked into lost and found?

No.

> What was the actual path?

To the lost files? The ones that I actually noticed missing were all 
/var/db/pkg/*/+CONTENTS. There were probably a lot of other files 
missing, but those were noticeable because the ports tree was throwing 
errors, and a missing +CONTENTS file can't be recovered from without 
re-installing the port.
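
For anyone who wants to check their own package database for the same
symptom, here is a minimal sketch. It is not the procedure used in this
report. It assumes the old-style layout in which every directory under
/var/db/pkg should contain a +CONTENTS file; the function name
dirs_missing_contents and its parameters are hypothetical.

```python
import os


def dirs_missing_contents(pkg_db="/var/db/pkg", marker="+CONTENTS"):
    """Return sorted names of directories under pkg_db that lack `marker`.

    Assumption: each package is one directory directly under pkg_db and
    a healthy package directory contains a regular file named `marker`.
    """
    if not os.path.isdir(pkg_db):
        # No package database at this path, so there is nothing to report.
        return []
    missing = []
    for entry in sorted(os.listdir(pkg_db)):
        path = os.path.join(pkg_db, entry)
        # Only directories are treated as packages; stray files are skipped.
        if os.path.isdir(path) and not os.path.isfile(os.path.join(path, marker)):
            missing.append(entry)
    return missing
```

Calling dirs_missing_contents() with no arguments scans /var/db/pkg and
returns the package directories that would need a reinstall. It only
detects a missing +CONTENTS file; it says nothing about other lost
files or about why the file is gone.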

> I'm trying to wrap my head around how this would be possible and where
> the error could be and whether it could be caused by SUJ.

It never happened before enabling SUJ, happened shortly after I did, and 
has never happened since I disabled it.

It's probably worth reiterating that the damage happened after an actual 
panic, as opposed to during "regular" operation.

> The number of
> interactions with disk writes is minimal. Corruption, if it occurs, would
> most likely be caused by a bad journal recovery.

Unlikely in this case, since the damage was not confined to 
recently-written files.


hth,

Doug

PS, my primary concern was that we not enable this by default until it 
can be demonstrated to be more robust. However, Nathan has already 
enabled it in the new installer, so now perhaps it would be fitting to 
send a message to -current letting people know that the plan is to have 
it on by default in 9.0, and asking people to resume more rigorous testing.

-- 

	Nothin' ever doesn't change, but nothin' changes much.
			-- OK Go

	Breadth of IT experience, and depth of knowledge in the DNS.
	Yours for the right price.  :)  http://SupersetSolutions.com/


