vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

Guido Falsi madpilot at FreeBSD.org
Thu Jun 22 17:06:20 UTC 2017


On 06/22/17 18:38, Warner Losh wrote:
> 
> 
> On Thu, Jun 22, 2017 at 2:26 AM, Guido Falsi <madpilot at freebsd.org>
> wrote:
> 
>     On 06/21/17 16:59, Guido Falsi wrote:
>     > On 06/13/17 13:44, Peter Blok wrote:
>     >> Hi,
>     >>
>     >> For a while now, I’ve been unable to build an RPI1-B image from -stable. I have narrowed it down to r318394, which adds a refresh option to geom_label. If I revert this change in today’s stable, it works fine. If I don’t, I continuously get:
>     >>
>     >> vm_fault: pager read error, pid 1 (init)
>     >> vnode_pager_generic_getpages_done: I/O read error 5
>     >>
>     >> I have looked at the fix and I can’t figure out why it breaks the code.
>     >>
>     >> And yes I have tried various other SD cards - they all have the same issue.
>     >>
>     >
>     > Hi,
>     >
>     > I'm seeing similar symptoms with NanoBSD images on PCEngines ALIX and
>     > APU2 boards, using compactflash and SD card storage respectively. The
>     > problem has appeared as soon as I started testing 11.1-BETA1 from the
>     > stable branch.
>     >
>     > The problem appears when I update the image using a slightly modified
>     > version of the standard nanobsd update and updatep[12] scripts. My
>     > changes do not touch the dd/gpart commands, which are the same as in
>     > the standard scripts. gpart seems the most likely culprit.
>     >
>     > I have just discovered this thread and I will test reverting r318394
>     > soon. Thanks to Peter for narrowing it down!
>     >
>     > Maybe this is related to having the disks mounted read-only?
>     >
> 
>     I noticed that after the problem appears, many commands, including
>     shutdown, start failing with "device not configured" for all mounted
>     file systems. I'm even unable to "ls /dev".
> 
>     It looks like the geom refresh changes the devices out from under the
>     system in a way that triggers this reaction.
> 
>     I don't know the geom code and have been unable to find an immediate
>     problem in the commit mentioned above. I'd really like some help to know
>     where to look, or what kind of debugging information is needed.
> 
>     This is quite a bad bug for people running NanoBSD and should be fixed
>     before the release.
> 
> 
> So can I recreate this with the embedded-type NanoBSD image? If this 
> change breaks NanoBSD, it will need to be reverted...
> 

You should be able to reproduce it by building a nanobsd image and then
updating it with the standard script, which writes the new image to the
"other" partition and uses gpart to mark that partition as bootable.

I'm using a slightly modified update script which also mounts the new
partition on /mnt and performs some operations there. It then unmounts
the partition and runs the "gpart set -a active -i ${_to}
${NANO_DRIVE}" command (which I suspect is exactly where the actual
problem is happening).
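
The sequence described above can be sketched roughly as follows. This is
only an illustration, not the actual update script: the device name
"ada0", the slice index 2, the image path "new.img", the MBR-style
"s${_to}" / "s${_to}a" device names and the run() helper are all
assumptions made for the example. It prints the commands by default
(DRY_RUN=1) instead of executing them.

```shell
#!/bin/sh
# Sketch of the update sequence described in the text; NOT the real
# NanoBSD update script. All defaults below are hypothetical.
: "${NANO_DRIVE:=ada0}"   # assumed disk device name
: "${_to:=2}"             # assumed index of the "other" (inactive) slice
: "${IMAGE:=new.img}"     # assumed path to the new slice image
: "${DRY_RUN:=1}"         # 1 = only print the commands, do not run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_other_partition() {
    # 1. Write the new image onto the inactive slice.
    run dd if="$IMAGE" of="/dev/${NANO_DRIVE}s${_to}" obs=64k
    # 2. Mount the new slice, apply local changes there, then unmount.
    run mount "/dev/${NANO_DRIVE}s${_to}a" /mnt
    run umount /mnt
    # 3. Mark the new slice active (the step suspected of triggering the bug).
    run gpart set -a active -i "${_to}" "${NANO_DRIVE}"
}

update_other_partition
```

Running it with DRY_RUN=0 on a scratch system would execute the commands
for real, so the device names should be checked first.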

I also tested reverting the change and can confirm that it makes the 
problem go away.

I'm sure it can be triggered by other gpart operations. I'm trying to
understand exactly which ones.

I'll follow up as soon as I have a simpler test case to reproduce it. I
first need to go back to an image affected by the problem.

Thanks for your feedback!

-- 
Guido Falsi <madpilot at FreeBSD.org>

