Buildworld stalls on rpi2 [various processes stuck in pfault and vmwait with 1996M Free Swap listed by top]

Mark Millard markmi at dsl-only.net
Sat Jan 13 08:59:48 UTC 2018


On 2018-Jan-12, at 4:54 PM, bob prohaska <fbsd at www.zefox.net> wrote:

> Trying to self-host a build of r327859 using a GENERIC kernel at  
> r327664, make seems to stall, with top showing
> 
> last pid: 28822;  load averages:  3.12,  3.95,  5.09    up 0+08:39:01  16:39:49
> 50 processes:  1 running, 47 sleeping, 2 waiting
> CPU:  0.0% user,  0.0% nice,  0.2% system,  0.9% interrupt, 98.9% idle
> Mem: 527M Active, 16M Inact, 98M Laundry, 148M Wired, 86M Buf, 3272K Free
> Swap: 2048M Total, 52M Used, 1996M Free, 2% Inuse
> 
>  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>  769 bob           1  20    0  6204K  1344K CPU0    0   3:02   0.68% top
>  674 bob           1  20    0 11188K  1636K select  1   0:18   0.04% sshd
>  719 root          1  20    0  4572K   552K select  0   0:03   0.01% make
> 28760 root          1  52    0   346M   302M pfault  2  13:59   0.00% c++
> 28812 root          1  52    0   208M   167M pfault  2   2:54   0.00% c++
> 28815 root          1  52    0   212M   171M pfault  1   2:20   0.00% c++
> 22172 root          1  20    0 13036K  4484K select  2   2:09   0.00% make
> 28820 root          1  52    0   145M   104M pfault  1   2:00   0.00% c++
> 21438 root          1  20    0  7092K   556K select  0   0:05   0.00% make
>  695 root          1  20    0  4016K   552K select  1   0:04   0.00% make
>  593 root          1  20    0  8156K  1596K vmwait  1   0:04   0.00% sendmail
> 20119 root          1  20    0  4516K   548K select  1   0:03   0.00% make
> 21427 root          1  20    0  4484K   556K select  2   0:02   0.00% make
>  590 root          1  20    0 10148K  1552K vmwait  1   0:02   0.00% sshd
> 22168 root          1  20    0  3956K   560K select  0   0:02   0.00% make
>  600 root          1  20    0  4960K     0K WAIT    2   0:01   0.00% <cron>
>  461 root          1  20    0  4916K  1020K select  1   0:01   0.00% syslogd
> 
> The machine seems dead, none of the ssh sessions responds to keystrokes, 
> nor the serial console. There are a smattering of 
> smsc0: warning: Failed to write register 0x114
> smsc0: warning: Failed to read register 0x114
> smsc0: warning: MII is busy
> smsc0: warning: Failed to write register 0x114
> 
> The machine still answers ping. Typing escape control-b does not
> bring up a debugger, did the keysequence change? Power cycling seems
> to be the only way out.

Is that kernel built with or without:

options         ALT_BREAK_TO_DEBUGGER

With that option, <CR>~^B (<CR> being <return>
and ^ being <control>) is the alternate sequence
for entering the debugger.
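
For reference, a sketch of the related knobs, assuming a
kernel that includes KDB and a debugger backend such as DDB
(whether a given GENERIC already has these is not stated
above, so check the config in use). The options lines are
kernel configuration; the last line is the run-time
sysctl/tunable form of the same switch:

```
# kernel configuration file
options         KDB
options         DDB
options         ALT_BREAK_TO_DEBUGGER

# /etc/sysctl.conf or /boot/loader.conf (needs KDB in the kernel)
debug.kdb.alt_break_to_debugger=1
```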

I've seen the smsc0 messages before, but I'm
not up to -r327664+ yet. That was with
a non-debug kernel running.

I've had large port builds get into such states,
especially when at least one large link operation
was active alongside other fairly large processes,
as I remember.

Note all the pfault and vmwait lines. It looks like
-r327316 and -r327468 were not enough to avoid this:
the paging/swapping appears to have gotten stuck
in some way. How tied that might be to the smsc0
messages, I've no clue.
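
A small sketch for tallying those states from saved top
output (it cannot help once the machine is already hung).
count_stuck is a hypothetical helper name, and it assumes
the column layout shown in the quoted listing, where STATE
is the 8th field and any "> " quoting prefix has been
stripped:

```sh
# Hypothetical helper: count processes in pfault or vmwait.
# Reads top's process lines on stdin; STATE is assumed to be field 8.
count_stuck() {
    awk '$8 == "pfault" || $8 == "vmwait" { n[$8]++ }
         END { printf "pfault=%d vmwait=%d\n", n["pfault"], n["vmwait"] }'
}

# Example use on a saved capture (the file name is illustrative):
#   count_stuck < top.out
```

The quoted listing above has 4 processes in pfault (all
c++) and 2 in vmwait (sendmail and sshd).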

You might get through by using -j3, -j2, or -j1, which
would likely use less process space at once (worst case)
than -j4 happened to.

Of course the build takes longer as you approach -j1
(or no explicit -j for the buildworld at all).
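
One rough way to pick a starting -j is to cap it by memory
as well as by CPU count. This is only a heuristic sketch,
not a tested rule: pick_jobs is a hypothetical helper, and
the figures are read off the quoted top output (about 792M
of RAM accounted for, 346M for the largest c++, 4 cores).

```sh
# Hypothetical helper: cap make's -j by memory as well as CPU count.
# Arguments: usable RAM in MB, assumed worst-case MB per job, CPU count.
pick_jobs() {
    n=$(( $1 / $2 ))
    if [ "$n" -lt 1 ]; then n=1; fi
    if [ "$n" -gt "$3" ]; then n=$3; fi
    echo "$n"
}

# Print (not run) the resulting command for the figures above.
echo "make -j$(pick_jobs 792 346 4) buildworld"
```

With those figures it prints "make -j2 buildworld". The
per-job size is a guess at the worst case, so a link step
or a larger compile could still exceed it.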

===
Mark Millard
markmi at dsl-only.net


