Bizarre clone attempt failures on Raspberry Pi2...

Paul Mather paul at gromit.dlib.vt.edu
Fri Jul 15 16:33:17 UTC 2016


On Jul 15, 2016, at 11:51 AM, Ian Lepore <ian at freebsd.org> wrote:

> On Fri, 2016-07-15 at 09:44 -0500, Karl Denninger wrote:
>> On 7/15/2016 09:22, Paul Mather wrote:
>> 
>>> On Jul 15, 2016, at 9:44 AM, Karl Denninger <karl at denninger.net>
>>> wrote:
>>> 
>>>> On 7/15/2016 08:36, Paul Mather wrote:
>>>>> On Jul 14, 2016, at 11:36 PM, Karl Denninger
>>>>> <karl at denninger.net> wrote:
>>>>> 
>>>>>> Found it.
>>>>>> 
>>>>>> Apparently the current code *requires* the label be set on the
>>>>>> msdos partition.  If it's not then not only does it not mount
>>>>>> (which shouldn't matter post-boot as the loader is supposed to
>>>>>> pass the dtb file, it is specified in the config file without
>>>>>> any sort of path prefix, and thus once the kernel has loaded it
>>>>>> should not matter if the dos partition is actually mounted or
>>>>>> not) *but* the boot process hangs without any indication of why!
>>>>>> 
>>>>>> So, you must do newfs_msdos -L MSDOSBOOT -F 16 {device}
>>>>>> 
>>>>>> If the "-L" is missing you're hosed; the system facially appears
>>>>>> to be just fine but while the loader comes up and so does the
>>>>>> kernel, it hangs without ever proceeding -- and without any sort
>>>>>> of error message indicating that it is unable to mount something
>>>>>> it needs.
>>>>> You have to do that because the device entry in the stock
>>>>> /etc/fstab is /dev/msdosfs/MSDOSBOOT.  The /dev/msdosfs part
>>>>> indicates it's using ms-dos labels.  In other words, this is
>>>>> just the same sort of failure you were getting when you weren't
>>>>> labelling the UFS partition as "rootfs".  Labelling the file
>>>>> system properly "fixes" the issue, as you would expect.
>>>>> 
>>>>> It's a misnomer to say the code "requires" labels.  It's just
>>>>> that's the way the distribution images are currently set up.  I
>>>>> have an older Pi that predates the current distribution images
>>>>> that just uses /dev/mmcsd0... device names in /etc/fstab.  Both
>>>>> approaches work fine.  You just need to make sure the devices
>>>>> you specify in /etc/fstab will actually exist when it comes
>>>>> time to mount the corresponding file system.
>>>> Except that if the root filesystem doesn't mount you get an error,
>>>> and thus you can figure out what's going on.  What excuse is there
>>>> for not printing an error message if a mount fails, and if
>>>> something in /etc/fstab fails to mount what's with hanging the
>>>> machine?  I've had disks be unavailable before on Intel
>>>> architecture machines (it happens when disks fail) and the result
>>>> is an error on the failure to mount but, unless it's the root
>>>> volume, the system still comes up.
>>> 
>>> Are you sure you don't get an error?  When I forgot to label rootfs
>>> recently when I cloned an SD card I got an error displayed on the
>>> serial console.  I didn't get an error on the HDMI screen console.
>> You get an error on the HDMI screen if rootfs is not labelled (as root
>> fails to mount).  There is *no* error on an HDMI screen if the msdosfs
>> is not labeled.
>>> As I've mentioned before directly, FreeBSD/arm acts like
>>> console="comconsole,vidconsole" is in effect.  This means that
>>> during /etc/rc boot processing, you'll only get output on
>>> comconsole (except for kernel messages, which seem to go to both). 
>>> That's been my experience in FreeBSD in general.
>>> 
>>> I dimly recall folks on here saying U-Boot doesn't currently
>>> enable/support USB keyboards, so there's not really much you can do
>>> to fix it interactively if you fail to boot the OS and hence enable
>>> USB keyboard support via FreeBSD.  That's not a problem if you use
>>> a serial console, which is supported by U-Boot.
>> Well, that's not true if the kernel is loaded.  Once the kernel loads
>> a usb keyboard works.
>>> 
>>> I'm not sure comparisons with Intel architecture machines are
>>> entirely appropriate as they use a different boot
>>> environment/mechanism.  Still, I stand by the fact that I've always
>>> got an error message on the serial console when disks on my
>>> FreeBSD/arm system have failed to mount at boot.  (It used to
>>> happen regularly with an external USB drive I had that took a long
>>> time to probe, and I ended up having to put a kern.cam.boot_delay
>>> in /boot/loader.conf to avoid the system dropping into single-user
>>> mode when doing a reboot.)
>>> 
>>> 
>>>>> If you stop using labels in your /etc/fstab then you won't have
>>>>> problems when those labels are missing.  If the labels are
>>>>> missing, the /dev/{msdosfs,ufs} devices will not be present and
>>>>> the system will drop to single-user mode because non-late,
>>>>> non-noauto file systems can't be accessed via their device nodes
>>>>> when attempting to mount them.  When that happens and you don't
>>>>> have a serial console enabled then you have problems
>>>>> remediating the situation.
>>>>> 
>>>>> If a file system is not needed to mount as part of booting (as
>>>>> you suggest for /boot/msdos) then you should probably flag it
>>>>> with the "noauto" option in /etc/fstab or remove it from
>>>>> /etc/fstab entirely.
>>>>> 
>>>>> I think the problem you were having is not copying all the
>>>>> required attributes of the file systems in question when
>>>>> cloning your SD cards, given your /etc/fstab setup.  It sounds
>>>>> like you've fixed that, now.
>>>> Again, if it dropped to single user mode *and said it was doing so*
>>>> or if there was an error message on the console when the filesystem
>>>> failed to mount I would have found this in a reasonable period of
>>>> time.  It wasn't that rough to do so with the ufs label once I knew
>>>> the filesystem was failing to mount, which was discernible from the
>>>> console output.
>>>> 
>>>> Not printing an error when things error out is rude at best, and
>>>> when that error is going to prevent the system from coming up this
>>>> darn well ought to show up where one with a monitor plugged in can
>>>> see it, eh?
>>>> 
>>>> There was literally no indication at all as to what was going on
>>>> and since gpart does not show filesystem labels for *either* BSD
>>>> labeled slices OR msdos figuring out what was different between the
>>>> two proved to be a bit troublesome.  IMHO at least the failure to
>>>> display an error message in this circumstance ought to be
>>>> corrected.
>>> 
>>> See above re: serial console vs. video console.
>>> 
>>> As for the labels, these are file system labels and not partition
>>> labels.  The big clue is in the device name in /etc/fstab.  (The
>>> "-l" option to "gpart show" will only show labels "[f]or
>>> partitioning schemes that support partition labels".  That's
>>> reasonable, IMHO, as partitions are not the same as file systems
>>> and gpart is concerned with partitions.)  In my experience,
>>> complaints about not being able to access /dev/ufs/something means
>>> you forgot to label a UFS file system as "something" when you made
>>> it. :-)
>>> 
>>> Cheers,
>>> 
>>> Paul.
>> 
>> Understood, but the issue here is that there's no indication without a
>> serial console that you have anything wrong -- the system appears to
>> have simply hung.
>> 
>> The quick fix is to put "failok" (or noauto) in the default /etc/fstab
>> entry for the dos filesystem, since it is not necessary for that
>> filesystem to be mounted at all on a running machine.  If there is a
>> policy reason to leave it accessible (and there's a fairly-clean
>> argument that there is) then "failok" might be preferable to "noauto",
>> but either way forcing a filesystem that is not necessary to be
>> accessible or the system fails to come up and does not give any
>> indication of same on what many users will have accessible to them is
>> facially wrong.
>> 
>> These devices are thought of as "appliances" by many and as such the
>> model of USB keyboard + HDMI (e.g. TV or monitor) is entirely
>> reasonable, and IMHO FreeBSD ought to, when possible, make that a
>> viable option.  It both is and can be provided the kernel loads, but
>> the defaults in pre-built configurations right now preclude that.
>> 
> 
> I'm having a hard time understanding how a problem report got generated
> about all this, or how any of it is anything other than "Karl
> misconfigured his system."
> 
> The downloadable system images work correctly.  You made a local change
> (formatted new media) and depending on how you want to look at it,
> either you didn't format correctly or you didn't make your config files
> match the way you formatted, and that made your system stop working. 
> It doesn't mean there is anything wrong about the way the downloadable
> images are generated.
> 
> Changing fstab in the distributed images so that a failure to mount a
> filesystem becomes a non-error seems like a bad idea to me.  The only
> way that problem happens with a downloaded image is if the image wasn't
> burned successfully, and that doesn't seem like something that needs to
> just get papered over just because in your use-case you don't really
> need the filesystem that failed to mount.
> 
> A PR about the fact that it hung without visibly reporting an error may
> be appropriate.  A PR that says we should just paper over the error
> because you don't care about it doesn't seem appropriate.


Maybe it should be filed as a "feature request" rather than a "bug."  Does Bugzilla support the distinction?

I agree with Ian that this is not a bug, in the sense that nobody installing from the distributed images will ever trigger it on their install media.

It is reasonable to file a feature request to omit /boot/msdos as a mandatory mount.  I think when I was first using FreeBSD/arm on my Raspberry Pi it wasn't mounted, but that predates the current distribution images.  Now it is.  I can see arguments either way and the current setting makes sense to me.  I think Warner and Ian hit the nail on the head that the real issue is the lack of output on the video console during /etc/rc processing.
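
For anyone cloning cards in the meantime, here is a rough sketch of a check for which /etc/fstab entries are effectively mandatory at boot and point at a device node that doesn't currently exist (for example, a /dev/msdosfs or /dev/ufs label that was never set).  The function name check_fstab is made up for illustration; it is not an existing FreeBSD tool.  It only looks at entries under /dev and treats "noauto" and "failok" entries as non-fatal.

```shell
#!/bin/sh
# Sketch only: check_fstab is a hypothetical helper, not a FreeBSD utility.
# Prints every fstab entry that is mounted automatically at boot but whose
# device node does not exist right now.
check_fstab() {
    # fstab fields: device mountpoint fstype options dump passno
    while read -r dev mnt fstype opts rest; do
        case "$dev" in
            ''|'#'*) continue ;;    # skip blank lines and comments
        esac
        case ",$opts," in
            # noauto is not mounted at boot; failok ignores mount errors
            *,noauto,*|*,failok,*) continue ;;
        esac
        case "$dev" in
            /dev/*)
                # e.g. /dev/msdosfs/MSDOSBOOT only exists if the label is set
                [ -e "$dev" ] || echo "missing: $dev ($mnt)"
                ;;
        esac
    done < "$1"
}
```

Running check_fstab /etc/fstab on the cloned system (or against the fstab on the mounted clone, from a machine that has the new card inserted) should print nothing if every mandatory device is present.  Adding "failok" or "noauto" to the options field of the /boot/msdos entry would take it out of the list.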

Incidentally, does setting console="vidconsole" in /boot/loader.conf fix the problem of a lack of /etc/rc messages for those who are using an HDMI monitor as their primary/only console?  If so, there may also be a case for making that the default if the assumption is that a minority of people will be using a serial console.  (Not a fair assumption right now, IMHO, but perhaps a fair one going forward as FreeBSD/arm becomes Tier 1.)
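
If someone with an HDMI-only setup wants to try it, the experiment would be the single line below in /boot/loader.conf.  I haven't tested this on FreeBSD/arm, so whether it actually moves the /etc/rc output to the video console is exactly the open question.

```shell
# /boot/loader.conf fragment (sh-style variable assignment)
# Assumption: untested on FreeBSD/arm; the effect on /etc/rc output
# is what needs confirming.
console="vidconsole"
```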

Cheers,

Paul.


