zfs booting feedback

Kurt Lidl lidl at pix.net
Thu Jul 12 17:22:11 UTC 2012


On Thu, Jul 12, 2012 at 03:02:56PM +0800, Gavin Mu wrote:
> On Wed, Jul 11, 2012 at 12:54 AM, Kurt Lidl <lidl at pix.net> wrote:
> 
> > On Mon, Jul 09, 2012 at 04:00:19PM +0200, Marius Strobl wrote:
> > > On Sat, Jul 07, 2012 at 10:54:35PM -0400, Kurt Lidl wrote:
> > > > I built a full 9.0-stable distribution on Friday night, and got to play
> > > > with installing it on a spare Netra T1-105 today.  Mostly I was
> > > > interested in testing out the integrated ZFS boot support that
> > > > was committed recently.
> > > >
> > > > First of all -- it works!  Thanks very much to all who made it
> > > > possible!
> > > >
> > > > After working through a couple of nits in my script that installs it
> > > > all, I've got a fully functioning, ZFS-only sparc64 machine.  Nice.
> > > >
> > > > The zfsboot bootblock's warnings about not being able to open
> > > > non-existent devices are pretty extraneous, but other than that, it
> > > > seems to function OK.
> > >
> > > That's more or less a cosmetic problem for now; there's no standard
> > > Open Firmware method for testing whether the device corresponding
> > > to an (automatically) created device alias actually exists short of
> > > trying to open it, with OFW causing at least the "Drive not ready"
> > > part on its own. There are some Sun-specific extensions to the
> > > default methods whose names sound like they could be of some help
> > > here. I haven't gotten around to actually testing whether this is the
> > > case or whether they actually exist in all OFW implementations of
> > > all sun4u models.
> > > If the aliases were artificially created via the `nvalias` command
> > > ("disk9" sounds a bit unusual for the automatically created ones)
> > > you can get rid of the nonexistent ones via `nvunalias` (needs
> > > a `reset-all` or power-cycle to take effect).
> >
> > All the disks that were probed were part of the normally
> > defined devices on the machine.  I only have two devices defined
> > in my nvramrc:
> >
> > ok nvramrc type
> > devalias rootdisk /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > devalias rootmirror /pci@1f,0/pci@1,1/scsi@2/disk@1,0
> >
> > And I have the system configured to boot from "rootdisk rootmirror".
> >
> > Here's the full output of a 'devalias' from the prom on the machine:
> >
> > ok devalias
> > cdrom1                   /pci@1f,0/pci@1,1/scsi@2/disk@6,0:f
> > cdrom                    /pci@1f,0/pci@1/pci@1/ide@e/cdrom@2:f
> > ide-disk                 /pci@1f,0/pci@1/pci@1/ide@e/disk@0:f
> > ide-cdrom                /pci@1f,0/pci@1/pci@1/ide@e/cdrom@2:f
> > ide                      /pci@1f,0/pci@1/pci@1/ide@e
> > rootmirror               /pci@1f,0/pci@1,1/scsi@2/disk@1,0
> > rootdisk                 /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > userprom2                /pci@1f,0/pci@1,1/ebus@1/flashprom@10,800000
> > userprom1                /pci@1f,0/pci@1,1/ebus@1/flashprom@10,400000
> > i2c-cs2                  /pci@1f,0/pci@1,1/ebus@1/i2c@14,100000
> > i2c                      /pci@1f,0/pci@1,1/ebus@1/i2c@14,600000
> > systemprom               /pci@1f,0/pci@1,1/ebus@1/flashprom@10,0
> > pcic                     /pci@1f,0/pci@1/pci@1
> > pcib                     /pci@1f,0/pci@1,1
> > pcia                     /pci@1f,0/pci@1
> > ebus                     /pci@1f,0/pci@1,1/ebus@1
> > net2                     /pci@1f,0/pci@1,1/network@3,1
> > net                      /pci@1f,0/pci@1,1/network@1,1
> > floppy                   /pci@1f,0/pci@1,1/ebus@1/fdthree
> > disk                     /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > cdrom                    /pci@1f,0/pci@1,1/scsi@2/disk@6,0:f
> > tape                     /pci@1f,0/pci@1,1/scsi@2/tape@4,0
> > tape1                    /pci@1f,0/pci@1,1/scsi@2/tape@5,0
> > tape0                    /pci@1f,0/pci@1,1/scsi@2/tape@4,0
> > diskf                    /pci@1f,0/pci@1,1/scsi@2/disk@f,0
> > diske                    /pci@1f,0/pci@1,1/scsi@2/disk@e,0
> > diskd                    /pci@1f,0/pci@1,1/scsi@2/disk@d,0
> > diskc                    /pci@1f,0/pci@1,1/scsi@2/disk@c,0
> > diskb                    /pci@1f,0/pci@1,1/scsi@2/disk@b,0
> > diska                    /pci@1f,0/pci@1,1/scsi@2/disk@a,0
> > disk9                    /pci@1f,0/pci@1,1/scsi@2/disk@9,0
> > disk8                    /pci@1f,0/pci@1,1/scsi@2/disk@8,0
> > disk7                    /pci@1f,0/pci@1,1/scsi@2/disk@7,0
> > disk6                    /pci@1f,0/pci@1,1/scsi@2/disk@6,0
> > disk5                    /pci@1f,0/pci@1,1/scsi@2/disk@5,0
> > disk4                    /pci@1f,0/pci@1,1/scsi@2/disk@4,0
> > disk3                    /pci@1f,0/pci@1,1/scsi@2/disk@3,0
> > disk2                    /pci@1f,0/pci@1,1/scsi@2/disk@2,0
> > disk1                    /pci@1f,0/pci@1,1/scsi@2/disk@1,0
> > disk0                    /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > scsi                     /pci@1f,0/pci@1,1/scsi@2
> > ttyb                     /pci@1f,0/pci@1,1/ebus@1/su@14,3602f8
> > ttya                     /pci@1f,0/pci@1,1/ebus@1/su@14,3803f8
> > ttyd                     /pci@1f,0/pci@1,1/ebus@1/se@14,400000:b
> > ttyc                     /pci@1f,0/pci@1,1/ebus@1/se@14,400000:a
> >
> > As you can see, the devices disk0..diskf exist, but something in the
> > boot code "only" probes the first 10 devices.  It's certainly not
> > attempting to open *all* the disk devices listed by 'devalias'.
> >
> > From the code in .../sys/boot/sparc64/loader/main.c, it looks like
> > the first MAXDEV (==31) disk devices are probed (well, whatever
> > disk%d is an alias to, I suppose) and the VTOCs are
> > loaded and examined for zfs partitions.
> >
> oops, I think I assumed that the disk names should be disk9, disk10,
> disk11, instead of disk9, diska, diskb...
> Are there any standards for naming those disks?

I do not really know.  The above 'devalias' output is the same on
the two Netra T1-105s that I tested.  I looked on my SunFire V240,
and it has many fewer entries:

{1} ok devalias
usb                      /pci@1e,600000/ide@d/disk
xnet2                    /pci@1d,700000/pci@1/SUNW,hme@0,1:dhcp,
xnet1                    /pci@1e,600000/pci@3/SUNW,hme@0,1:dhcp,
xnet                     /pci@1e,600000/pci@2/SUNW,hme@0,1:dhcp,
net3                     /pci@1d,700000/network@2,1
net2                     /pci@1d,700000/network@2
net1                     /pci@1f,700000/network@2,1
net                      /pci@1f,700000/network@2
cdrom                    /pci@1e,600000/ide@d/cdrom@0,0:f
ide                      /pci@1e,600000/ide@d
disk3                    /pci@1c,600000/scsi@2/disk@3,0
disk2                    /pci@1c,600000/scsi@2/disk@2,0
disk1                    /pci@1c,600000/scsi@2/disk@1,0
disk0                    /pci@1c,600000/scsi@2/disk@0,0
disk                     /pci@1c,600000/scsi@2/disk@0,0
scsi                     /pci@1c,600000/scsi@2
sc-control               /pci@1e,600000/isa@7/rmc-comm@0,3e8
ttyb                     /pci@1e,600000/isa@7/serial@0,2e8
ttya                     /pci@1e,600000/isa@7/serial@0,3f8
name                     aliases
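
The disk9/diska naming mismatch mentioned above would explain why only
ten probes find anything on the Netra.  Here is a rough Python sketch of
that reasoning.  It is not loader code: the alias set is copied from the
Netra listing, MAXDEV is the value quoted above, and probe_names() is a
made-up helper for illustration only.

```python
# Disk aliases present on the Netra T1-105 (disk0..disk9, diska..diskf),
# as shown in the devalias output quoted above.
NETRA_DISK_ALIASES = {"disk%x" % n for n in range(16)}

MAXDEV = 31  # value quoted earlier in this thread


def probe_names(count, fmt):
    """Return the alias names a loop over 0..count-1 would try (hypothetical helper)."""
    return [fmt % n for n in range(count)]


# Names built with a decimal suffix versus a hexadecimal suffix.
decimal_hits = [n for n in probe_names(MAXDEV, "disk%d") if n in NETRA_DISK_ALIASES]
hex_hits = [n for n in probe_names(MAXDEV, "disk%x") if n in NETRA_DISK_ALIASES]

print(len(decimal_hits))  # 10: disk0..disk9 ("disk10" is not an alias here)
print(len(hex_hits))      # 16: disk0..diskf
```

With a decimal suffix, every name from "disk10" upward simply does not
exist as an alias on this machine, so only the first ten can be opened.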

I would argue that what the loader ought to be looking at is the
devices/devalias entries named in the "boot-device" property.

That way, if I wanted to boot from something like a zmirror of
disk2 and disk3 on my sunfire, I would just set the
"boot-device" to be "disk2 disk3", and the zfs boot code would
just try to iterate through those devices, rather than going
from 0..31 and trying disk%d...
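
As a minimal sketch of that idea (Python rather than loader C, and
boot_candidates() is a hypothetical name, not an existing function),
the list of devices to try could be taken straight from the
space-separated "boot-device" value, keeping its order:

```python
def boot_candidates(boot_device):
    """Split a boot-device string into an ordered list of entries.

    Order is preserved and repeated entries are tried only once.
    """
    entries = []
    for entry in boot_device.split():
        if entry not in entries:
            entries.append(entry)
    return entries


print(boot_candidates("disk2 disk3"))  # ['disk2', 'disk3']
```

Only the entries the administrator actually listed get probed, in the
order they were listed.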

If I had valid boot code on disk0 and disk2, and I set the
"boot-device" to "disk2 disk3", I think the current code will do
this:
	- prom loads the "zfsboot" block off disk2
	- zfsboot block loads in the zfsloader binary from current disk (disk2)
	- which then probes disk0, disk1 ... and finally boots
	the kernel from the first freebsd-zfs partition that it finds
	on any of those disks.

I think this is wrong, as there could be some data-only zfs
partition on disk0, which doesn't have a kernel to boot from...
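
That scenario can be modelled in a few lines of Python.  The disk
layout below is entirely made up for illustration, and first_zfs() is
a hypothetical helper, not anything in the loader:

```python
# Hypothetical layout: which disks hold a ZFS pool, and whether it has a kernel.
LAYOUT = {
    "disk0": {"zfs": True, "kernel": False},  # data-only pool
    "disk2": {"zfs": True, "kernel": True},   # one half of the boot mirror
    "disk3": {"zfs": True, "kernel": True},   # the other half
}


def first_zfs(names, layout):
    """Return the first name in `names` that holds a ZFS pool, or None."""
    for name in names:
        if layout.get(name, {}).get("zfs"):
            return name
    return None


fixed_order = ["disk%d" % n for n in range(31)]  # probe disk0, disk1, ...
from_prom = "disk2 disk3".split()                # follow boot-device instead

print(first_zfs(fixed_order, LAYOUT))  # disk0, which has no kernel
print(first_zfs(from_prom, LAYOUT))    # disk2
```

Probing in fixed order lands on the data-only pool first, while
following "boot-device" picks the disk that was asked for.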

Also, one other thing to keep in mind is that the boot-device property
can be a devalias entry or just a straight-up device specifier,
like this:

	/pci@1c,600000/scsi@2/disk@0,0:a

(That's what I have on my SunFire, for various arcane reasons...)
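
Handling both forms might look roughly like the Python sketch below.
It assumes that anything starting with "/" is a full device specifier
and that everything else is an alias name.  resolve() and the small
alias table are hypothetical, and the sketch ignores alias names that
carry arguments of their own.

```python
# A few aliases copied from the SunFire V240 listing above.
ALIASES = {
    "disk0": "/pci@1c,600000/scsi@2/disk@0,0",
    "disk2": "/pci@1c,600000/scsi@2/disk@2,0",
    "disk3": "/pci@1c,600000/scsi@2/disk@3,0",
}


def resolve(entry, aliases):
    """Map one boot-device entry to a device path.

    Entries starting with "/" are taken as full device specifiers;
    anything else is looked up as an alias (None if there is no such alias).
    """
    if entry.startswith("/"):
        return entry
    return aliases.get(entry)


print(resolve("disk2", ALIASES))
print(resolve("/pci@1c,600000/scsi@2/disk@0,0:a", ALIASES))
```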

I guess we also have to worry about someone breaking into the prom
and saying "boot disk4"; that user input should override the
"boot-device" setting in the prom.

-Kurt

