[Bug 220971] Freebsd 11.0p11 - system freeze on intensive I/O

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Sat Aug 12 22:35:17 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220971

--- Comment #8 from Mark Millard <markmi at dsl-only.net> ---
(In reply to execve from comment #7)

I tried a couple of variations of the experiment
that I suggested. Unfortunately the results are
a little complicated to interpret.

Context: FreeBSD running under VirtualBox
(on Windows 10 Pro) with the configuration
shown below. (Bugzilla 206048 has pointed out
reproducibility under virtual machines.)

FreeBSDx64OPC11S# uname -apKU
FreeBSD FreeBSDx64OPC11S 11.1-STABLE FreeBSD 11.1-STABLE  r322433M  amd64 amd64
1101501 1101501

# svnlite diff /usr/src/
Index: /usr/src/sys/amd64/conf/GENERIC
===================================================================
--- /usr/src/sys/amd64/conf/GENERIC     (revision 322433)
+++ /usr/src/sys/amd64/conf/GENERIC     (working copy)
@@ -24,7 +24,8 @@
 makeoptions    DEBUG=-g                # Build kernel with gdb(1) debug symbols
 makeoptions    WITH_CTF=1              # Run ctfconvert(1) for DTrace support

-options        SCHED_ULE               # ULE scheduler
+#options       SCHED_ULE               # ULE scheduler
+options        SCHED_4BSD              # 4BSD scheduler
 options        PREEMPTION              # Enable kernel thread preemption
 options        INET                    # InterNETworking
 options        INET6                   # IPv6 communications protocols

I tried:

4 processors and 1 GiByte of RAM assigned
using: stress -d 2 -m 3 --vm-keep

and separately:

8 processors and 1 GiByte of RAM assigned
using: stress -d 6 -m 3 --vm-keep

In each case I had top -Cawopid running in
its own ssh session into the virtual machine.
stress was run from a separate ssh session.

In the 2nd case I got a lock-up: top
stopped updating and input was ignored
in both ssh sessions (top and stress) and
in the console window, including input
such as ^C and ^T .

The console window did eventually show:

swap_pager: I/O error - pageout failed; blkno 7367,size 4096, error 12

(After seeing that I waited a while longer, then gave up
and killed the virtual machine.)

I later found a list message about such
"error 12" variants of the message:

QUOTE
> I think it might be ENOMEM from a geom when trying to g_clone_bio.
. . .
It shouldn't happen, but you should notice no ill effects (that is, the
page isn't lost, it just wasn't paged out and there's a few bytes less
that the pager could do at the moment).
END QUOTE.
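
For reference, the trailing number in the console message is
an errno value, and errno 12 on FreeBSD is ENOMEM, which is
consistent with the guess in the quote. A minimal sh sketch of
pulling the number out of such a line follows. The variable
names and the two-entry lookup are illustrative assumptions
only, not part of any FreeBSD tool:

```shell
#!/bin/sh
# Console line copied from the report above.
msg='swap_pager: I/O error - pageout failed; blkno 7367,size 4096, error 12'

# Strip everything up to and including the last "error " to keep the number.
err=${msg##*error }

# Tiny illustrative lookup; see errno(2) for the full list.
case "$err" in
    5)  name=EIO ;;
    12) name=ENOMEM ;;
    *)  name=unknown ;;
esac

echo "error $err = $name"
```

The same idea could be applied to other swap_pager lines saved
from the console, to tell ENOMEM failures apart from real I/O
errors.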

As for the structure of the lock-up. . .

Unfortunately top did not happen to refresh before
it locked up, so it shows nothing about how the other
processes were stuck.

It does at least appear to be harder to get a lock-up
(or to get ENOMEM and a failure to page out) with
SCHED_4BSD, to the degree that just a couple of tests
indicate anything. But getting stuck still appears
possible, and pageouts can apparently fail for lack
of memory.


