RPI3 swap experiments

Warner Losh imp at bsdimp.com
Fri Jun 29 23:37:52 UTC 2018

On Fri, Jun 29, 2018 at 4:54 PM, bob prohaska <fbsd at www.zefox.net> wrote:

> On Wed, Jun 27, 2018 at 06:12:53PM -0700, Mark Millard wrote:
> >
> > It would be handy for something simpler than buildworld to be
> > able to induce some of the types of problems, for sure.
> >
> It appears that running
> root at www:/home/bob/stress2/misc # sh ./snap.sh > 5thstress2.log &
> triggers a stream of console errors that resemble those produced
> by -j4 buildworld:
> GEOM: md10: invalid disklabel.
> GEOM: ufsid/5a5a128b05c5944d: invalid disklabel.
> (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 00 18 a9 80 00 00 30 00
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 00 18 a9 80 00 00 30 00

The drive is broken somewhere at or below the SIM. Reasonable write
requests from CAM's periph da are being rejected. This tells me there's a
problem. In this case, the SIM is umass which additionally has the USB
stack under it. If I had to wager, I'd wager more on the USB driver being
wonky than umass. Something is happening, and the SIM is returning an error
to the periph. Your next step in tracking this down would be to see why
by instrumenting umass' xpt_done calls that complete the CCBs that are
queued by the sim's action routine.
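
As a sanity check on "reasonable write requests", the CDB in the log can be decoded by hand. The sketch below is my own illustration, not anything from the umass driver; decode_write10 is a hypothetical helper name. It relies on the standard SCSI WRITE(10) layout: opcode 0x2a in byte 0, a big-endian LBA in bytes 2-5, and a big-endian transfer length in bytes 7-8. The 512-byte block size is an assumption about this particular drive, not something the log states.

```python
def decode_write10(cdb_hex):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks). Hypothetical helper for illustration only.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return lba, blocks


# CDB copied from the console log quoted above.
lba, blocks = decode_write10("2a 00 00 18 a9 80 00 00 30 00")

ASSUMED_BLOCK_SIZE = 512  # assumption: sector size is not shown in the log
print(lba, blocks, blocks * ASSUMED_BLOCK_SIZE)
```

Under that block-size assumption this is a 48-block (24 KiB) write at LBA 1616256, which is an ordinary request and not one the device should be refusing.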

> The error stream continues until powercycling, escape to debugger does
> not work. To some extent, the machine remains responsive in ssh sessions,
> at least for simple commands (ls -l, tail). The stress2 logfile is empty.

Yes. Something early in this sequence breaks the drive. Since the errors
are CAM errors like this, they are at the umass-sim0 layer or lower.
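
To make the layering explicit, the prefix on each console line can be split into its parts. This is a rough sketch of my own, assuming the prefix has the form (periph:sim:bus:target:lun) as in the lines quoted above; parse_cam_line is a hypothetical name, and the field labels are my reading of the format, not output from any CAM tool.

```python
import re

# Assumed prefix layout: (periph:sim:bus:target:lun): message
CAM_LINE = re.compile(
    r"^\((?P<periph>[^:]+):(?P<sim>[^:]+):(?P<bus>\d+):"
    r"(?P<target>\d+):(?P<lun>[0-9a-f]+)\):\s*(?P<msg>.*)$"
)


def parse_cam_line(line):
    """Return a dict of the path fields, or None for non-CAM lines."""
    match = CAM_LINE.match(line.strip())
    return match.groupdict() if match else None


fields = parse_cam_line(
    "(da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error"
)
print(fields["periph"], fields["sim"], fields["msg"])
```

Lines that do not carry a CAM path, such as the GEOM disklabel messages, come back as None, so the same helper can separate the two kinds of error in a captured log.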

> The details captured are visible at
> http://www.zefox.net/~fbsd/rpi3/swaptests/r335655/1gbsdflash/stress2/5thtest/
> One oddity is a stream of undisplayable characters, briefly, in
> http://www.zefox.net/~fbsd/rpi3/swaptests/r335655/1gbsdflash/stress2/5thtest/5thswapuse.log
> following the initial error message.
> Another oddity is that during a single user reboot similar errors are
> briefly displayed before starting the shell. After two rounds of fsck -f
> they seem to vanish, not to return without appropriate provocation. This
> does not happen on every reboot, but enough to be no longer surprising.
> It looks as if this particular error message has nothing at all to do
> with swap.
> Does it seem in any way related to similar errors seen during
> swap-stressed buildworld sessions?

Looks like the same to me, and I have the same diagnosis.


> Thanks for reading,
> bob prohaska
