Boot0cfg bug redux (Re: sys/boot/boot0/boot0.S - r186598)

Luigi Rizzo rizzo at iet.unipi.it
Mon Jan 10 23:19:09 UTC 2011


In order to understand the bug discussed in the recent thread
(original message attached at the end), Tom Judge passed me the
dump of the boot sector around the bug.

The system giving trouble has the following configuration:

Fresh transcript:
file1: ORIGINAL BOOT SECTOR
    # boot0cfg -v ad0
    #   flag     start chs   type       end chs       offset         size
    1   0x80      0:  1: 1   0xa5    494: 15:63           63       498897
    2   0x00    495:  1: 1   0xa5    989: 15:63       499023       498897
    3   0x00    990:  0: 1   0xa5    992: 15:63       997920         3024


file2: boot sector after running 'boot0cfg -s 2 -v ad0'
    > cmp -x file1 file2
    000001b5 00 01              _OPT, default option

    No big surprises here: the default selection changes from 0 to 1.
    HOWEVER, boot0cfg does not alter the 'active' flag in the
    partition table. This triggers, if I remember well, a 'feature'
    in the boot1/boot2 code, which does not know/honor the selected
    partition and instead boots the first partition marked as 'active'
    or, failing that, the first FreeBSD partition.

As a consequence, if we reboot without pressing an F-key, the system
boots from partition s1 even though the boot loader indicates F2.
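
To make the selection rule concrete, here is a minimal sketch of it as I
described it above. It is not taken from the boot1/boot2 sources, and the
name pick_slice is made up for illustration. The partition table offset
(0x1be), the 16-byte entry size, and the position of the type byte within
an entry are the standard MBR layout.

```python
from typing import Optional

PART_TABLE_OFF = 0x1BE   # offset of the first MBR partition entry
ENTRY_SIZE = 16          # bytes per partition entry
ACTIVE = 0x80            # value of the 'active' flag byte
TYPE_FREEBSD = 0xA5      # partition type shown in the table above


def pick_slice(sector: bytes) -> Optional[int]:
    """Return the 1-based slice chosen by the rule described above.

    Hypothetical model of the behaviour, not the real boot1 code:
    first entry marked active, otherwise first FreeBSD entry.
    Note that the _OPT byte at 0x1b5 is never consulted.
    """
    entries = [
        sector[PART_TABLE_OFF + i * ENTRY_SIZE:
               PART_TABLE_OFF + (i + 1) * ENTRY_SIZE]
        for i in range(4)
    ]
    # First choice: the first entry whose flag byte marks it active.
    for i, entry in enumerate(entries):
        if entry[0] == ACTIVE:
            return i + 1
    # Fallback: the first entry whose type byte (offset 4) is FreeBSD.
    for i, entry in enumerate(entries):
        if entry[4] == TYPE_FREEBSD:
            return i + 1
    return None
```

With slice 1 still flagged active, this returns 1 no matter what the
default option byte says, which matches what Tom observed.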

file3: boot sector after the above reboot
    > cmp -x file1 file3
    000001b5 00 01

Next, reboot, this time pressing F2. After the boot we are running from s2,
and the boot sector has now changed:

file4: boot sector after pressing F2

    > cmp -x file1 file4
    000001b4 00 b1              _NXTDRV
    000001b5 00 01              _OPT, default option
    000001be 80 00              active flag, slice 1
    000001ce 00 80              active flag, slice 2

As expected the 'active' flag is updated as a result of a boot from 
the partition selected. 
This is something that could be done by 'boot0cfg -s ...' 
to achieve the desired behaviour.
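
For anyone who wants to repeat the comparison, the annotated listings
above can be produced with something like the following sketch. The
offsets 0x1b4 and 0x1b5 are the ones shown in the cmp output; the active
flag offsets follow from the standard MBR layout (first entry at 0x1be,
16 bytes per entry). The helper names describe and annotated_diff are
invented for this example.

```python
from typing import List

# Field names as used in the listings above.
FIELDS = {0x1B4: "_NXTDRV", 0x1B5: "_OPT, default option"}
PART_TABLE_OFF = 0x1BE   # first MBR partition entry
ENTRY_SIZE = 16          # bytes per partition entry


def describe(offset: int) -> str:
    """Name the boot sector field at a zero-based offset, if known."""
    if offset in FIELDS:
        return FIELDS[offset]
    rel = offset - PART_TABLE_OFF
    # The flag byte is the first byte of each of the four entries.
    if 0 <= rel < 4 * ENTRY_SIZE and rel % ENTRY_SIZE == 0:
        return "active flag, slice %d" % (rel // ENTRY_SIZE + 1)
    return ""


def annotated_diff(a: bytes, b: bytes) -> List[str]:
    """List differing bytes as 'offset old new  name', like cmp -x."""
    lines = []
    for off, (x, y) in enumerate(zip(a, b)):
        if x != y:
            lines.append(
                ("%08x %02x %02x  %s" % (off, x, y, describe(off))).rstrip()
            )
    return lines
```

Usage would be along the lines of
annotated_diff(open("file1", "rb").read(), open("file4", "rb").read()).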

The only "surprise" here is that _NXTDRV has changed. I am unsure
if this was the result of an erroneous F5 keypress. Indeed, 0xb1 is
probably the correct initial value of the byte at 0x1b4; I/we
probably forgot to initialize the field.


So, to summarize, I guess that a possible fix (that does not involve
using gpart, or even worse, modifying boot0.S, which probably does
not have any spare space) is to modify boot0cfg so that it sets the
'active' flag for the partition corresponding to the default entry.
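
As a rough illustration of what I have in mind (this is not boot0cfg code,
just a model of the change on an in-memory copy of the boot sector), the
update would look like the sketch below. The function name
set_default_slice is hypothetical, and the sketch only handles the four
MBR slices, not the other choices the boot menu may offer.

```python
OPT_OFF = 0x1B5          # _OPT, default option (from the cmp output above)
PART_TABLE_OFF = 0x1BE   # first MBR partition entry
ENTRY_SIZE = 16          # bytes per partition entry
ACTIVE = 0x80            # value of the 'active' flag byte


def set_default_slice(sector: bytearray, slice_no: int) -> None:
    """Set the default option AND the matching 'active' flag.

    Hypothetical sketch of the proposed fix: slice_no is 1-based,
    while the value stored in _OPT is 0-based.
    """
    if not 1 <= slice_no <= 4:
        raise ValueError("slice must be in 1..4")
    sector[OPT_OFF] = slice_no - 1
    # Mark the chosen entry active and clear the flag on the others,
    # so that the later boot stages agree with the boot0 default.
    for i in range(4):
        flag_off = PART_TABLE_OFF + i * ENTRY_SIZE
        sector[flag_off] = ACTIVE if i == slice_no - 1 else 0x00
```

Applied to file1 with slice_no=2, this gives the same _OPT and active flag
bytes as file4, without requiring a manual F2 keypress.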

What do people think ?

cheers
luigi

On Sun, Jan 09, 2011 at 12:39:28AM -0600, Tom Judge wrote:
> Hi,
> 
> Today I ran into an issue where setting the default slice with boot0cfg
> -s is broken.
> 
> This is related to a section of this revision:
> 
> + commit Warner's patch "orb $NOUPDATE,_FLAGS(%bp)"
>   to avoid writing to disk in case of a timeout/default choice;
> 
> This issue is quite well documented in bin/134907 which has been open
> since May 2009.
> 
> Reproduced with a fresh nanobsd build:
> 
> Boot 1 - Slice 1 active as set by nanobsd image builder:
> 
> ===
> # boot0cfg -v ad0
> #   flag     start chs   type       end chs       offset         size
> 1   0x80      0:  1: 1   0xa5    494: 15:63           63       498897
> 2   0x00    495:  1: 1   0xa5    989: 15:63       499023       498897
> 3   0x00    990:  0: 1   0xa5    992: 15:63       997920         3024
> 
> version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> options=packet,update,nosetdrv
> volume serial ID 9090-9090
> default_selection=F1 (Slice 1)
> ===
> 
> Update the active slice to 2:
> ===
> # boot0cfg -s 2 -v ad0
> #   flag     start chs   type       end chs       offset         size
> 1   0x80      0:  1: 1   0xa5    494: 15:63           63       498897
> 2   0x00    495:  1: 1   0xa5    989: 15:63       499023       498897
> 3   0x00    990:  0: 1   0xa5    992: 15:63       997920         3024
> 
> version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> options=packet,update,nosetdrv
> volume serial ID 9090-9090
> default_selection=F2 (Slice 2)
> ===
> 
> Reboot and let boot0 time out and boot default slice 2:
> ===
> # boot0cfg -v ad0
> #   flag     start chs   type       end chs       offset         size
> 1   0x80      0:  1: 1   0xa5    494: 15:63           63       498897
> 2   0x00    495:  1: 1   0xa5    989: 15:63       499023       498897
> 3   0x00    990:  0: 1   0xa5    992: 15:63       997920         3024
> 
> version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> options=packet,update,nosetdrv
> volume serial ID 9090-9090
> default_selection=F2 (Slice 2)
> ===
> The system actually booted into slice 1 here.
> This was verified by dropping to the loader prompt and using show to grab:
> loaddev=disk0s1a:
> 
> Reboot and hit 2 at the boot0 prompt:
> ===
> # boot0cfg -v ad0
> #   flag     start chs   type       end chs       offset         size
> 1   0x00      0:  1: 1   0xa5    494: 15:63           63       498897
> 2   0x80    495:  1: 1   0xa5    989: 15:63       499023       498897
> 3   0x00    990:  0: 1   0xa5    992: 15:63       997920         3024
> 
> version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> options=packet,update,nosetdrv
> volume serial ID 9090-9090
> default_selection=F2 (Slice 2)
> ===
> 
> This time we really boot into slice 2.
> 
> The attached patch backs out the relevant part of r186598.
> 
> There was a post on the embedded list that suggested this workaround:
>     echo 'a 2' | fdisk -f /dev/stdin ad0
>     boot0cfg -s 2 ad0
> 
> There are 2 issues with this:
> 1) It can't be done without setting kern.geom.debugflags to 0x10.
> 2) It caused most/all commands to fail with the error message
> "Device not configured", including the second command and 'shutdown -r now'.
> 
> Both of which really leave this workaround fairly broken.
> 
> 
> Tom
> 

> Index: boot0.S
> ===================================================================
> --- boot0.S	(revision 213760)
> +++ boot0.S	(working copy)
> @@ -373,7 +373,6 @@
>  	 * Timed out or default selection
>  	 */
>  use_default:	movb _OPT(%bp),%al		# Load default
> -		orb $NOUPDATE,_FLAGS(%bp) 	# Disable updates
>  		jmp check_selection		# Join common code
>  
>  	/*




