Making world but no kernel

Jerome Herman jherman at dichotomia.fr
Tue Jul 26 12:50:06 UTC 2011


On 26/07/2011 13:44, Jeremy Chadwick wrote:
> On Tue, Jul 26, 2011 at 01:04:04PM +0200, Jerome Herman wrote:
>> I would like to know if it is possible to rebuild world, but without
>> upgrading or even compiling the kernel.
>>
>> The problem is this: I am presently working on a FreeBSD station
>> that seems to have quite a lot of problems, notably with fsck. I am
>> starting to wonder whether this BSD station was properly installed,
>> or if some of the system tools were copied from an older FreeBSD setup.
>> Since the machine is in a remote location, I would prefer to avoid
>> a full reinstall if possible. Among other things, single user mode is
>> not available.
>>
>> So I was wondering, if I get the full sources with sysinstall, can I
>> make buildworld and then installworld without going through the
>> kernel phase or would this be a bad idea ?
> Is it possible?  Yes.  Is it a bad idea?  Generally yes.  World and
> kernel effectively need to be "in sync"; some kernel binary structures
> (particularly for things like libkvm) need to be what userland binaries
> expect them to be.  Nobody will be able to provide any support for this
> configuration.
I think kernel and world are already out of sync. This machine is a
pre-installed BSD from an ISP, and I have no clue how it was
installed. But I suspect that world was not built or rebuilt properly.
I of course got the sources that match my kernel, and plan to
reinstall world just to make sure it is in sync with the kernel.
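
One rough way to test the "out of sync" suspicion before rebuilding is to compare the version number the running kernel reports with the one in the installed userland headers. The sketch below assumes the installed /usr/include/sys/param.h came from the same installworld as the system tools, which may not hold on a box where binaries were copied in by hand. Matching numbers would therefore be a hint, not proof. The function names are made up for illustration.

```shell
#!/bin/sh
# Print the __FreeBSD_version recorded in a param.h header.
# The header path is an argument so the function can be tried on a copy.
header_version() {
    awk '$1 == "#define" && $2 == "__FreeBSD_version" { print $3; exit }' "$1"
}

# Compare two version numbers and report whether they agree.
compare_versions() {
    if [ "$1" = "$2" ]; then
        echo "in sync: $1"
    else
        echo "MISMATCH: kernel=$1 userland-headers=$2"
    fi
}

# On the FreeBSD box itself (not run here):
#   compare_versions "$(sysctl -n kern.osreldate)" \
#                    "$(header_version /usr/include/sys/param.h)"
```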
>
> If you're trying to do things "in phases" because of this "fsck
> problem" (see below for more on that), then please be sure that after
> you rebuild world and reinstall world, that you DO NOT empty out
> /usr/obj before rebuilding kernel/reinstalling kernel.  The kernel build
> does refer to things in /usr/obj which were built as a result of
> buildworld.
Yes, I know the kernel build relies on the toolchain that buildworld
leaves in /usr/obj, so I won't touch it until I have a clearer picture
of this box.
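
The phased approach discussed here could be sketched roughly as follows. This is not the usual upgrade order (which installs the new kernel before the new world); it only illustrates the world-first sequence from this thread, with /usr/obj left alone between the two phases. The run helper, the function names, and the SRCDIR, KERNCONF and DRY_RUN variables are assumptions for illustration; GENERIC is only a placeholder kernel config.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Phase 1: userland only. The body is a subshell, so the cd does not leak.
rebuild_world() (
    run cd "${SRCDIR:-/usr/src}" &&
    run make buildworld &&
    run make installworld
)

# Phase 2, later: kernel, reusing what buildworld left in /usr/obj.
# There is deliberately no "rm -rf /usr/obj" between the two phases.
rebuild_kernel() (
    run cd "${SRCDIR:-/usr/src}" &&
    run make buildkernel KERNCONF="${KERNCONF:-GENERIC}" &&
    run make installkernel KERNCONF="${KERNCONF:-GENERIC}"
)

# Preview without touching the system:
#   DRY_RUN=1 rebuild_world
```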

>
> All that said: can we please get some deeper insight as to this
> "problems with fsck" you're referring to?  I'm of the strong opinion
> that it's better to try and solve the root cause of an issue than do
> "hackish stuff" like the above (though it's not that hackish, you get
> what I mean I hope).  I don't understand how fsck would cause you a
> problem unless the machine is constantly losing power or has serious
> issues with its storage.
Neither. I have a gvinum setup for the data disks. After
a forced reboot due to a power failure, the box would not come up. Booting
from the rescue drive, I found that it refused to boot because it could
not mount the data partition (/dev/gvinum/data), and this in turn was
because fsck would not work on that partition.
So I turned off the daemons, removed /dev/gvinum/data from fstab and
booted again.

No problems.

Tried to fsck /dev/gvinum/data and got
fsck: Could not determine filesystem type
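
One possible reading of this particular message, not confirmed anywhere in the thread: with the entry removed from fstab, fsck has no recorded type for the device and has to guess it from a disk label, which a gvinum volume may not carry. In that case naming the type with -t should get past the message, though it would not explain the fsck_ufs hang. The sketch below only builds the command line; the function names are hypothetical, and "ufs" is assumed from the fsck_ufs usage in this message.

```shell
#!/bin/sh
# Print the filesystem type that an fstab file records for a device,
# or nothing if the device has no active (uncommented) entry.
fstab_type() {
    dev=$1
    fstab=${2:-/etc/fstab}
    awk -v dev="$dev" '$1 == dev { print $3; exit }' "$fstab"
}

# Build the fsck command line, naming the type explicitly when fstab cannot.
fsck_command() {
    dev=$1
    fstab=${2:-/etc/fstab}
    if [ -n "$(fstab_type "$dev" "$fstab")" ]; then
        echo "fsck $dev"
    else
        echo "fsck -t ufs $dev"   # "ufs" is an assumption, see above
    fi
}

# Example: fsck_command /dev/gvinum/data
```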

fsck_ufs /dev/gvinum/data got stuck on phase 1 for 8 hours before I 
hard-canceled it.

Trying to mount the volume resulted in
mount: /dev/gvinum/data : Operation not permitted

gvinum list gives the following information:

3 drives:
D c                     State: up       /dev/ad7        A: 1/1430799 MB (0%)
D a                     State: up       /dev/ad5        A: 1/1430799 MB (0%)
D b                     State: up       /dev/ad6        A: 1/1430799 MB (0%)

1 volume:
V data                State: up       Plexes:       2 Size:       2095 GB

2 plexes:
P data.p0           S State: up       Subdisks:     3 Size:       2095 GB
P data.p1           S State: up       Subdisks:     3 Size:       2095 GB

6 subdisks:
S data.p0.s0          State: up       D: a            Size:        698 GB
S data.p0.s1          State: up       D: b            Size:        698 GB
S data.p0.s2          State: up       D: c            Size:        698 GB
S data.p1.s0          State: up       D: c            Size:        698 GB
S data.p1.s1          State: up       D: a            Size:        698 GB
S data.p1.s2          State: up       D: b            Size:        698 GB
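
The listing reads as two striped plexes (the "S" flag) of three subdisks each, mirrored into one volume, and the sizes add up: 3 x 698 GB = 2094 GB per plex, which agrees with the reported 2095 GB within rounding. A small sketch for checking such a listing mechanically is given below; it assumes the column layout shown above, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sum subdisk sizes per plex from "gvinum list" output read on stdin.
# Subdisk lines look like: "S data.p0.s0  State: up  D: a  Size:  698 GB".
plex_totals() {
    awk '$1 == "S" {
             plex = $2
             sub(/\.s[0-9]+$/, "", plex)      # data.p0.s0 -> data.p0
             total[plex] += $8
         }
         END { for (p in total) print p, total[p], "GB" }' | sort
}

# List every drive, volume, plex, or subdisk whose state is not "up".
# Plex lines carry an extra organization column, so the state is field 5.
not_up() {
    awk '($1 == "D" || $1 == "V" || $1 == "S") && $4 != "up" { print $2 }
         $1 == "P" && $5 != "up" { print $2 }'
}

# Example: gvinum list | plex_totals
```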


Then I did a newfs on the volume, which went well, and I was able to mount
it again without any problem.

Still testing, I decided to umount the volume and run fsck on it.
The same problems came back: plain fsck failed, fsck_ufs got stuck
on phase 1, and mount returned "Operation not permitted".
Newfs again - no problems.  Mount again - no problems.

Destroyed the gvinum volume, made every disk into a standard UFS drive
and ran fsck on each of them: no problems.

Tried to create FreeBSD partitions with gvinum slices instead of using
the disks directly: same old, same old.

So now I am starting to think that my disklabel and fsck are not in
sync with my kernel.

>
> Are you sure the problem, for example, isn't with the underlying storage
> device (disk)?  If you aren't sure, would you like to verify that's
> not the problem piece?  If so, please post some details like:
>
> * dmesg
> * Contents of /etc/fstab
> * sysctl kern.disks
>
> If the disks are backed by ata(4):
>
> * atacontrol list
> * atacontrol cap XXX   (where XXX = each disk shown in kern.disks)
>
> If the disks are backed by ada(4) or are SCSI (da(4)):
>
> * camcontrol devlist
> * For ada(4) disks only: camcontrol identify XXX
> * For da(4) disks only:  camcontrol inquiry XXX
>
> And regardless of if ata(4), ada(4), or da(4):
>
> * smartctl -a /dev/XXX (where XXX = each disk shown in kern.disks; this
>    will require you install ports/sysutils/smartmontools first)
>
> I can assist with the disk analysis portion in particular.
>
> And with regards to smartctl, please try to ensure the output doesn't
> get munged (forced line wrapping, newlines injected, etc.).  It makes it
> more difficult to read.  Put the output up on the web if you're worried
> about this.
>
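
To keep the smartctl output from being munged, it could be written to one file per disk and uploaded as-is. A minimal sketch, assuming smartmontools is installed as requested above; the collect_smart name, the SMARTCTL override variable, and the /tmp/diag directory are hypothetical.

```shell
#!/bin/sh
# Save "smartctl -a" output for each named disk into its own file, so the
# text can be uploaded untouched instead of being pasted into a mail client.
# SMARTCTL may be overridden; it defaults to the smartmontools binary.
collect_smart() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for disk in "$@"; do
        # smartctl often exits non-zero on warnings, so do not stop on it.
        "${SMARTCTL:-smartctl}" -a "/dev/$disk" > "$outdir/smart-$disk.txt" 2>&1
    done
    return 0
}

# On the FreeBSD box (not run here); kern.disks is a space-separated list:
#   collect_smart /tmp/diag $(sysctl -n kern.disks)
```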
