[CFT] Patch to bsdinstall to support root-on-ZFS and GELI
Johan Broman
johan at bridgenet.se
Sat Oct 19 20:27:27 UTC 2013
On 19/10/13 20:52, Teske, Devin wrote:
>
> On Oct 19, 2013, at 10:07 AM, Johan Broman wrote:
>
>>
>>
>> On 19/10/13 18:27, Allan Jude wrote:
>>> On 2013-10-19 11:55, Teske, Devin wrote:
>>>> On Oct 19, 2013, at 8:43 AM, Teske, Devin wrote:
>>>>
>>>>> On Oct 19, 2013, at 8:34 AM, Allan Jude wrote:
>>>>>
>>>>>> On 2013-10-19 11:31, Johan Broman wrote:
>>>>>>>
>>>>>>> On 19/10/13 17:23, Allan Jude wrote:
>>>>>>>> On 2013-10-19 10:56, Johan Broman wrote:
>>>>>>>>> Hi!
>>>>>>>>>
>>>>>>>>> Just tested the root-on-ZFS install option using FreeBSD 10 beta 1. I
>>>>>>>>> have 4 SATA drives in my server. I select all four of them in a RAIDZ1
>>>>>>>>> setup. I hit enter to continue the installation and the zpool is
>>>>>>>>> created, but I'm then returned to the zpool selection screen again. It
>>>>>>>>> turned out that two of the drives had previously been used in a
>>>>>>>>> (Linux) software mirror setup and because of this they got activated
>>>>>>>>> in /dev/raid/r0. Because of this I ended up in an endless bsdinstall
>>>>>>>>> loop.
>>>>>>>>>
>>>>>>>>> Removing the raid device using the graid command resolved the
>>>>>>>>> situation.
>>>>>>>>>
>>>>>>>>> Now maybe this is working as designed, but there was no warning/alert
>>>>>>>>> to the fact that the devices couldn't be used. Perhaps a warning
>>>>>>>>> should be raised in this situation?
>>>>>>>>>
>>>>>>>>> Thanks for all the great work on the new installer, really looking
>>>>>>>>> forward to FreeBSD 10!
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>> Johan
>>>>>>>>> _______________________________________________
>>>>>>>>> freebsd-current at freebsd.org mailing list
>>>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>>>>>>>> To unsubscribe, send any mail to
>>>>>>>>> "freebsd-current-unsubscribe at freebsd.org"
>>>>>>>> Errors like that normally generate a msgbox dialog with the error output
>>>>>>>> from whichever command failed. I'll have to dig into it and see where
>>>>>>>> that problem is. I've seen other people have problems creating ZFS
>>>>>>>> arrays after graid, but in that case it was an incomplete graid label
>>>>>>>> causing a device to be locked but not appear in the graid status output.
>>>>>>>>
>>>>>>> Ah ok. A msgbox did appear but the drives that had the problem (ada2
>>>>>>> and ada3) weren't visible in the output. (not sure if the box itself
>>>>>>> has a size limit or maybe I was just unable to scroll down and see the
>>>>>>> errors?). The only visible output was that it was able to create
>>>>>>> labels on ada0 and ada1.
>>>>>>>
>>>>>>> /Johan
>>>>>> Ahh yes, you have to press 'page-down' to scroll the msgbox. I tried to
>>>>>> add a scrollbar but turns out that is not possible.
>>>>>>
>>>>>> The only indication that there is more message to read, is a small 'xx%'
>>>>>> in the bottom right. We might have to look at breaking that output up or
>>>>>> something.
>>>>>>
>>>>>
>>>>> The only reason for a msgbox widget to scroll is if it is displayed at
>>>>> maximum height or width of the screen and it *still* has more data
>>>>> to display than can be presented at-once.
>>>>>
>>>> I should clarify...
>>>>
>>>> The zfsboot script doesn't use dialog(1) directly. It uses the bsdconfig API.
>>>> That being said, msgbox widgets automatically scale their size to fit the
>>>> content being displayed. So whenever a msgbox is thrown up using this
>>>> API... the widget will never scroll unless the box can't be made big enough
>>>> to hold the entire content (either the screen resolution or terminal size is
>>>> too small; we maxed out the size of the widget; and there's still hidden
>>>> content).
>>>>
>>>> But...
>>>>
>>>> While all of bsdconfig uses this API, hardly any of bsdinstall uses this API.
>>>>
>>>>
>>>>
>>>>> If... however... the msgbox widget is *not* full-height or full-width
>>>>> yet... it is requiring you to scroll -- then we've found a bug.
>>>>>
>>>>> Can we get a screen shot?
>>>> So we really need to nail down precisely which error box this is so that
>>>> we can address whether the issue is in-fact an instance of using the old
>>>> error-box handling instead of the auto-sizing API.
>>>>
>>>> So...
>>>>
>>>> With this described API, you should never have to scroll a box unless it
>>>> can't fit all the data *and* you should be able to immediately identify when
>>>> that becomes the case...
>>>>
>>>> 1. The widget spans the entire width of the screen.
>>>>
>>>> 2. The widget spans the entire height of the screen.
>>>>
>>>> 3. Both 1 and 2.
>>>>
>>>> It's in *those* cases that you should then *EXPECT* to find that the
>>>> region can scroll with cursor keys and page up/down (look for the
>>>> scroll percentage in the widget as Allan suggested).
>>>>
>>>> I don't want to see the scroll percentage doohickey *unless* the widget
>>>> is auto-sized to full-width or full-height. Meaning, there's either a bug in
>>>> the API or someone fell into a trap (there are a couple).
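The sizing rule described above can be sketched in plain sh. This is only an illustration, not the bsdconfig implementation: the function name msgbox_needs_scroll and the allowance of 6 rows and 4 columns for borders, title and the OK button are assumptions (the 6-row figure is picked to match the "24 lines high, maybe 18 visible" estimate given in this thread).

```shell
#!/bin/sh
# Hypothetical helper, not part of bsdconfig or bsdinstall.
# usage: msgbox_needs_scroll screen_rows screen_cols text
# Returns 0 (true) when the text cannot fit even in a box that has
# already grown to the full size of the screen.
msgbox_needs_scroll()
{
	local rows="$1" cols="$2" text="$3"
	local max_lines=$(( $rows - 6 ))   # assumed overhead: borders, title, button
	local max_width=$(( $cols - 4 ))   # assumed overhead: borders and padding
	local nlines=0 longest=0 line

	# Count the lines and remember the longest one.
	while IFS= read -r line; do
		nlines=$(( $nlines + 1 ))
		if [ ${#line} -gt $longest ]; then
			longest=${#line}
		fi
	done <<EOF
$text
EOF

	[ $nlines -gt $max_lines ] || [ $longest -gt $max_width ]
}
```

With these assumed margins, a 100-line error dump on a 24x80 console needs scrolling, while a one-line message does not. A scroll indicator on a box that is smaller than the screen would then point to a bug rather than to oversized content.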
>>>
>>> the error output msgbox is huge, probably 100+ lines (the screen is
>>> what, 24 lines high, and with the ok button, top and bottom reserved
>>> space etc, can display maybe 18 lines at once)
>>>
>>> It contains all the shell output from everything we do, creating the
>>> gparts, setting up gnop, all of the redundant destroys etc.
>>>
>>> I don't think the TINY little % in the bottom right is really enough of
>>> an indicator to the user that they CAN scroll, let alone HOW to scroll
>>> (IIRC the arrow keys do not work, must use page down)
>>>
>>
>> I recreated the graid mirror on ada2 and ada3 and reran the installation. I'm unable to scroll the msgbox using PgDn or arrow keys. There is no indication that the action failed and I'm returned to the ZFS setup screen if I hit OK.
>>
>> I have screen shots (taken with my phone) of the msgbox and "ps auxwww" output. Let me know what kind of debug info you would like. I've put the screen shots here:
>>
>> http://212.181.212.146/bsdinstall
>>
>
> It looks like one of the commands that is used to partition
> the disks is producing an error status on exit but ... no error?
>
> Double-check me on this, but...
>
> 1. It looks to me like this is what you're seeing (code-wise):
>
> From http://svnweb.freebsd.org/base/head/usr.sbin/bsdinstall/scripts/zfsboot?revision=256553&view=markup
>
> 	if ! error=$( zfs_create_boot "$ZFSBOOT_POOL_NAME" \
> 	              "$vdev_type" $real_disks 2>&1 )
> 	then
> 		f_dialog_msgbox "$error"
> 		f_interactive || f_die
> 		continue
> 	fi
>
Yep, that looks like it. So maybe the command appears to go ok and hence
no stderr output?
> 2. That looks like our guy; f_dialog_msgbox() will use the
> currently-active dialog title.
>
> NOTE: This should probably be changed to be more clear in
> several ways. First, drop stdout to /dev/null keeping only stderr.
> Second, probably use ${error:-Unknown error has occurred}
> so that if some program returns error but doesn't produce an
> error message... we have some sensible fallback. Last, but not
> least, change the title to "Error" and put some prefix before the
> error text (with aforementioned fallback).
Ah yes. Makes sense to have a fallback.
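A rough sketch of how the first two suggestions could look in sh, just to illustrate the idea. The names fake_create_boot and run_or_report are made up for this example and are not part of zfsboot; in the real script the echo would be the f_dialog_msgbox call from the snippet quoted above.

```shell
#!/bin/sh
# Stand-in for zfs_create_boot: writes progress to stdout, then fails
# without printing anything on stderr (the silent-failure case).
fake_create_boot()
{
	echo "creating labels on ada0 ada1"
	return 1
}

# Hypothetical helper: run a command, keep only its stderr, and report
# a fallback text if the command fails without saying why.
run_or_report()
{
	local error
	# Order matters: 2>&1 first points stderr at the capture, then
	# >/dev/null discards stdout.
	if ! error=$( "$@" 2>&1 >/dev/null ); then
		echo "Error: ${error:-Unknown error has occurred}"
		return 1
	fi
}
```

Running "run_or_report fake_create_boot" prints "Error: Unknown error has occurred" and returns non-zero, so the user at least sees that something failed even when the command gave no reason.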
>
> 3. Diving into the "zfs_create_boot" function... and further, the
> "zfs_create_diskpart" function...
>
> There are any number of reasons why you would get thrown
> back to the ZFS Configuration menu. A few are listed below:
>
> Inability to write to $BSDINSTALL_TMPETC/fstab
>
> You've specified an invalid swapsize.
> NB: Therein lies another problem ... we don't catch the error
> from f_expand_number and then tell you why a custom swap
> size is perhaps invalid.
>
> "gpart create -s gpt $disk" failed
> "gpart destroy -F $disk" failed
> NB: This is irrespective of whether you chose MBR or GPT; it's
> a bug that should be fixed (we shouldn't return failure on the
> pre-cursory destruction of existing data).
>
> etc...
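For the destroy/create case, one possible shape of the fix is sketched below. It is only an assumption about how it might be done, not the actual zfs_create_diskpart code: prepare_disk and the two fake_gpart stand-ins are hypothetical, and the gpart command is passed in as the first argument so the sketch can run without touching a real disk.

```shell
#!/bin/sh
# Stand-in for gpart(8): "destroy" fails, "create" succeeds.
fake_gpart()
{
	case "$1" in
	destroy) echo "destroy failed" >&2; return 1 ;;
	create)  return 0 ;;
	esac
}

# Stand-in for a disk that cannot be partitioned at all.
fake_gpart_locked()
{
	echo "$1 failed" >&2
	return 1
}

# Hypothetical helper.
# usage: prepare_disk gpart_command disk
prepare_disk()
{
	local gpart="$1" disk="$2"

	# Preliminary cleanup: a failure here often just means there was
	# no partition table to destroy, so it is ignored.
	$gpart destroy -F "$disk" >/dev/null 2>&1 || :

	# Creating the new table is the step that must succeed.
	if ! $gpart create -s gpt "$disk" 2>/dev/null; then
		echo "cannot create partition table on $disk" >&2
		return 1
	fi
}
```

If the destroy failed for a real reason (for example the disk is held by another GEOM class), the create step would be expected to fail as well, and that failure is the one that gets reported.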
>
> So thank you !! looks like I've got some patching to do to
> improve the debugging.
>
np! :)
Cheers
Johan