svn commit: r347532 - in head: contrib/netbsd-tests/lib/libc/sys lib/libc/sys lib/libc/tests/sys sys/amd64/vmm sys/sys sys/vm usr.bin/vmstat

Mark Johnston markj at FreeBSD.org
Mon May 13 16:38:52 UTC 2019


Author: markj
Date: Mon May 13 16:38:48 2019
New Revision: 347532
URL: https://svnweb.freebsd.org/changeset/base/347532

Log:
  Provide separate accounting for user-wired pages.
  
  Historically we have not distinguished between kernel wirings and user
  wirings for accounting purposes.  User wirings (via mlock(2)) were
  subject to a global limit on the number of wired pages, so if large
  swaths of physical memory were wired by the kernel, as happens with
  the ZFS ARC among other things, the limit could be exceeded, causing
  user wirings to fail.
  
  The change adds a new counter, v_user_wire_count, which counts the
  number of virtual pages wired by user processes via mlock(2) and
  mlockall(2).  Only user-wired pages are subject to the system-wide
  limit, which helps provide some safety against deadlocks.  In
  particular, while sources of kernel wirings typically support some
  backpressure mechanism, there is no way to reclaim user-wired pages
  short of killing the wiring process.  The limit is exported as
  vm.max_user_wired, renamed from vm.max_wired, and changed from u_int
  to u_long.
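
  As a rough illustration of the accounting described above, the sketch
  below models a bounded counter in userspace with C11 atomics.  It is
  not the kernel code: user_wire_count, max_user_wired and the two
  helper names are hypothetical stand-ins for v_user_wire_count and the
  vm.max_user_wired limit, and the value 1024 is an arbitrary example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel counter and its limit (in pages). */
static atomic_ulong user_wire_count;
static unsigned long max_user_wired = 1024;

/*
 * Try to charge npages against the limit.  The compare-and-swap loop
 * retries when another thread changed the counter in the meantime, so
 * the limit check and the update behave as one atomic step.
 */
static bool
user_wire_count_add(unsigned long npages)
{
	unsigned long wired;

	wired = atomic_load(&user_wire_count);
	do {
		if (npages + wired > max_user_wired)
			return (false);
		/* On failure, wired is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&user_wire_count, &wired,
	    npages + wired));
	return (true);
}

/* Release npages that were previously charged. */
static void
user_wire_count_sub(unsigned long npages)
{
	assert(atomic_load(&user_wire_count) >= npages);
	atomic_fetch_sub(&user_wire_count, npages);
}
```

  In this model a request that would push the counter past the limit is
  refused without modifying the counter, so a failed wiring attempt
  leaves nothing to undo.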
  
  The choice to count virtual user-wired pages rather than physical
  pages was made for simplicity.  There are mechanisms that can cause
  user-wired mappings to be destroyed while maintaining a wiring of
  the backing physical page; these make it difficult to accurately
  track user wirings at the physical page layer.
  
  The change also closes some holes which allowed user wirings to succeed
  even when they would cause the system limit to be exceeded.  For
  instance, mmap() may now fail with ENOMEM in a process that has called
  mlockall(MCL_FUTURE) if the new mapping would cause the user wiring
  limit to be exceeded.
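
  From an application's point of view, the behaviour described above
  means that a process using mlockall(MCL_FUTURE) should be prepared
  for mmap() to report ENOMEM.  A minimal sketch follows, assuming
  anonymous private mappings; the helper names wire_future_mappings()
  and map_anon() are hypothetical and exist only for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* exposes MAP_ANON on glibc; harmless on FreeBSD */
#endif
#include <sys/mman.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Ask the kernel to wire all future mappings; returns 0 or an errno value. */
static int
wire_future_mappings(void)
{
	return (mlockall(MCL_FUTURE) == 0 ? 0 : errno);
}

/*
 * Map len bytes of anonymous memory.  On success return the address and
 * store 0 in *errp.  On failure return NULL and store the errno value in
 * *errp; after wire_future_mappings() that value may be ENOMEM when the
 * new mapping would exceed the user wiring limit.
 */
static void *
map_anon(size_t len, int *errp)
{
	void *p;

	assert(errp != NULL);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE,
	    -1, 0);
	*errp = (p == MAP_FAILED) ? errno : 0;
	return (p == MAP_FAILED ? NULL : p);
}
```

  A caller would typically invoke wire_future_mappings() once at
  startup and treat an ENOMEM from map_anon() as a request to release
  wired memory or to run with a higher limit.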
  
  Note that bhyve -S is subject to the user wiring limit, which defaults
  to 1/3 of physical RAM.  Users that wish to exceed the limit must tune
  vm.max_user_wired.
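
  To judge whether the limit needs tuning, the two sysctls named in
  this change can be compared.  The sketch below is one way to do that
  with sysctlbyname(3) on FreeBSD; the helper names are hypothetical,
  the non-FreeBSD branch is only a stub that fails with ENOSYS, and
  clamping the result at zero is a choice made for this example.

```c
#include <sys/types.h>
#ifdef __FreeBSD__
#include <sys/sysctl.h>
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Pages still available to user wirings: limit minus current, clamped at 0. */
static unsigned long
user_wire_headroom(unsigned long limit, unsigned long wired)
{
	return (wired >= limit ? 0 : limit - wired);
}

/* Read a u_long sysctl by name; returns 0 on success, -1 with errno set. */
static int
read_ulong_sysctl(const char *name, unsigned long *valp)
{
	assert(name != NULL && valp != NULL);
#ifdef __FreeBSD__
	size_t len = sizeof(*valp);

	if (sysctlbyname(name, valp, &len, NULL, 0) != 0)
		return (-1);
	if (len != sizeof(*valp)) {
		errno = EINVAL;
		return (-1);
	}
	return (0);
#else
	errno = ENOSYS;		/* stub: these sysctls exist only on FreeBSD */
	return (-1);
#endif
}

/* Store the remaining user-wirable page count in *pagesp; 0 or -1. */
static int
query_user_wire_headroom(unsigned long *pagesp)
{
	unsigned long limit, wired;

	if (read_ulong_sysctl("vm.max_user_wired", &limit) != 0 ||
	    read_ulong_sysctl("vm.stats.vm.v_user_wire_count", &wired) != 0)
		return (-1);
	*pagesp = user_wire_headroom(limit, wired);
	return (0);
}
```

  Both values are page counts, so the result must be multiplied by the
  page size before it is compared with a guest memory size passed to
  bhyve.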
  
  Reviewed by:	kib, ngie (mlock() test changes)
  Tested by:	pho (earlier version)
  MFC after:	45 days
  Sponsored by:	Netflix
  Differential Revision:	https://reviews.freebsd.org/D19908

Modified:
  head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c
  head/lib/libc/sys/mlock.2
  head/lib/libc/sys/mlockall.2
  head/lib/libc/tests/sys/mlock_helper.c
  head/sys/amd64/vmm/vmm.c
  head/sys/sys/vmmeter.h
  head/sys/vm/vm_glue.c
  head/sys/vm/vm_map.c
  head/sys/vm/vm_map.h
  head/sys/vm/vm_meter.c
  head/sys/vm/vm_mmap.c
  head/sys/vm/vm_pageout.c
  head/sys/vm/vm_pageout.h
  head/sys/vm/vm_unix.c
  head/usr.bin/vmstat/vmstat.c

Modified: head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c
==============================================================================
--- head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c	Mon May 13 16:38:48 2019	(r347532)
@@ -51,7 +51,7 @@ __RCSID("$NetBSD: t_mlock.c,v 1.6 2016/08/09 12:02:44 
 #define _KMEMUSER
 #include <machine/vmparam.h>
 
-void set_vm_max_wired(int);
+void set_vm_max_wired(u_long);
 void restore_vm_max_wired(void);
 #endif
 

Modified: head/lib/libc/sys/mlock.2
==============================================================================
--- head/lib/libc/sys/mlock.2	Mon May 13 15:39:54 2019	(r347531)
+++ head/lib/libc/sys/mlock.2	Mon May 13 16:38:48 2019	(r347532)
@@ -28,7 +28,7 @@
 .\"	@(#)mlock.2	8.2 (Berkeley) 12/11/93
 .\" $FreeBSD$
 .\"
-.Dd March 20, 2018
+.Dd May 13, 2019
 .Dt MLOCK 2
 .Os
 .Sh NAME
@@ -97,13 +97,13 @@ resource limit and the
 system-wide
 .Dq wired pages
 limit
-.Va vm.max_wired .
-.Va vm.max_wired
+.Va vm.max_user_wired .
+.Va vm.max_user_wired
 applies to the system as a whole, so the amount available to a single
 process at any given time is the difference between
-.Va vm.max_wired
+.Va vm.max_user_wired
 and
-.Va vm.stats.vm.v_wire_count .
+.Va vm.stats.vm.v_user_wire_count .
 .Pp
 If
 .Va security.bsd.unprivileged_mlock
@@ -124,13 +124,11 @@ will fail if:
 is set to 0 and the caller is not the super-user.
 .It Bq Er EINVAL
 The address range given wraps around zero.
-.It Bq Er EAGAIN
-Locking the indicated range would exceed the system limit for locked memory.
 .It Bq Er ENOMEM
 Some portion of the indicated address range is not allocated.
 There was an error faulting/mapping a page.
-Locking the indicated range would exceed the per-process limit for locked
-memory.
+Locking the indicated range would exceed the per-process or system-wide limits
+for locked memory.
 .El
 The
 .Fn munlock
@@ -171,11 +169,11 @@ system calls first appeared in
 Allocating too much wired memory can lead to a memory-allocation deadlock
 which requires a reboot to recover from.
 .Pp
-The per-process resource limit is a limit on the amount of virtual
-memory locked, while the system-wide limit is for the number of locked
-physical pages.
-Hence a process with two distinct locked mappings of the same physical page
-counts as 2 pages against the per-process limit and as only a single page
-in the system limit.
+The per-process and system-wide resource limits of locked memory apply
+to the amount of virtual memory locked, not the amount of locked physical
+pages.
+Hence two distinct locked mappings of the same physical page counts as
+2 pages aginst the system limit, and also against the per-process limit
+if both mappings belong to the same physical map.
 .Pp
 The per-process resource limit is not currently supported.

Modified: head/lib/libc/sys/mlockall.2
==============================================================================
--- head/lib/libc/sys/mlockall.2	Mon May 13 15:39:54 2019	(r347531)
+++ head/lib/libc/sys/mlockall.2	Mon May 13 16:38:48 2019	(r347532)
@@ -30,7 +30,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd December 25, 2012
+.Dd May 13, 2019
 .Dt MLOCKALL 2
 .Os
 .Sh NAME
@@ -69,7 +69,7 @@ limited in how much they can lock down.
 A single process can lock the minimum of a system-wide
 .Dq wired pages
 limit
-.Va vm.max_wired
+.Va vm.max_user_wired
 and the per-process
 .Dv RLIMIT_MEMLOCK
 resource limit.
@@ -138,9 +138,9 @@ and
 functions first appeared in
 .Fx 5.1 .
 .Sh BUGS
-The per-process resource limit is a limit on the amount of virtual
-memory locked, while the system-wide limit is for the number of locked
-physical pages.
-Hence a process with two distinct locked mappings of the same physical page
-counts as 2 pages against the per-process limit and as only a single page
-in the system limit.
+The per-process and system-wide resource limits of locked memory apply
+to the amount of virtual memory locked, not the amount of locked physical
+pages.
+Hence two distinct locked mappings of the same physical page counts as
+2 pages aginst the system limit, and also against the per-process limit
+if both mappings belong to the same physical map.

Modified: head/lib/libc/tests/sys/mlock_helper.c
==============================================================================
--- head/lib/libc/tests/sys/mlock_helper.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/lib/libc/tests/sys/mlock_helper.c	Mon May 13 16:38:48 2019	(r347532)
@@ -39,16 +39,16 @@ __FBSDID("$FreeBSD$");
 #include <limits.h>
 #include <stdio.h>
 
-#define	VM_MAX_WIRED "vm.max_wired"
+#define	VM_MAX_WIRED "vm.max_user_wired"
 
 static void
-vm_max_wired_sysctl(int *old_value, int *new_value)
+vm_max_wired_sysctl(u_long *old_value, u_long *new_value)
 {
 	size_t old_len;
-	size_t new_len = (new_value == NULL ? 0 : sizeof(int));
+	size_t new_len = (new_value == NULL ? 0 : sizeof(*new_value));
 
 	if (old_value == NULL)
-		printf("Setting the new value to %d\n", *new_value);
+		printf("Setting the new value to %lu\n", *new_value);
 	else {
 		ATF_REQUIRE_MSG(sysctlbyname(VM_MAX_WIRED, NULL, &old_len,
 		    new_value, new_len) == 0,
@@ -60,14 +60,14 @@ vm_max_wired_sysctl(int *old_value, int *new_value)
 	    "sysctlbyname(%s) failed: %s", VM_MAX_WIRED, strerror(errno));
 
 	if (old_value != NULL)
-		printf("Saved the old value (%d)\n", *old_value);
+		printf("Saved the old value (%lu)\n", *old_value);
 }
 
 void
-set_vm_max_wired(int new_value)
+set_vm_max_wired(u_long new_value)
 {
 	FILE *fp;
-	int old_value;
+	u_long old_value;
 
 	fp = fopen(VM_MAX_WIRED, "w");
 	if (fp == NULL) {
@@ -78,7 +78,7 @@ set_vm_max_wired(int new_value)
 
 	vm_max_wired_sysctl(&old_value, NULL);
 
-	ATF_REQUIRE_MSG(fprintf(fp, "%d", old_value) > 0,
+	ATF_REQUIRE_MSG(fprintf(fp, "%lu", old_value) > 0,
 	    "saving %s failed", VM_MAX_WIRED);
 
 	fclose(fp);
@@ -90,7 +90,7 @@ void
 restore_vm_max_wired(void)
 {
 	FILE *fp;
-	int saved_max_wired;
+	u_long saved_max_wired;
 
 	fp = fopen(VM_MAX_WIRED, "r");
 	if (fp == NULL) {
@@ -98,14 +98,14 @@ restore_vm_max_wired(void)
 		return;
 	}
 
-	if (fscanf(fp, "%d", &saved_max_wired) != 1) {
+	if (fscanf(fp, "%lu", &saved_max_wired) != 1) {
 		perror("fscanf failed\n");
 		fclose(fp);
 		return;
 	}
 
 	fclose(fp);
-	printf("old value in %s: %d\n", VM_MAX_WIRED, saved_max_wired);
+	printf("old value in %s: %lu\n", VM_MAX_WIRED, saved_max_wired);
 
 	if (saved_max_wired == 0) /* This will cripple the test host */
 		return;

Modified: head/sys/amd64/vmm/vmm.c
==============================================================================
--- head/sys/amd64/vmm/vmm.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/amd64/vmm/vmm.c	Mon May 13 16:38:48 2019	(r347532)
@@ -754,7 +754,8 @@ vm_mmap_memseg(struct vm *vm, vm_paddr_t gpa, int segi
 		    VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
 		if (error != KERN_SUCCESS) {
 			vm_map_remove(&vm->vmspace->vm_map, gpa, gpa + len);
-			return (EFAULT);
+			return (error == KERN_RESOURCE_SHORTAGE ? ENOMEM :
+			    EFAULT);
 		}
 	}
 

Modified: head/sys/sys/vmmeter.h
==============================================================================
--- head/sys/sys/vmmeter.h	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/sys/vmmeter.h	Mon May 13 16:38:48 2019	(r347532)
@@ -153,6 +153,8 @@ extern domainset_t vm_severe_domains;
 #define	VM_CNT_INC(var)		VM_CNT_ADD(var, 1)
 #define	VM_CNT_FETCH(var)	counter_u64_fetch(vm_cnt.var)
 
+extern u_long vm_user_wire_count;
+
 static inline void
 vm_wire_add(int cnt)
 {

Modified: head/sys/vm/vm_glue.c
==============================================================================
--- head/sys/vm/vm_glue.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_glue.c	Mon May 13 16:38:48 2019	(r347532)
@@ -181,21 +181,8 @@ vslock(void *addr, size_t len)
 	if (last < (vm_offset_t)addr || end < (vm_offset_t)addr)
 		return (EINVAL);
 	npages = atop(end - start);
-	if (npages > vm_page_max_wired)
+	if (npages > vm_page_max_user_wired)
 		return (ENOMEM);
-#if 0
-	/*
-	 * XXX - not yet
-	 *
-	 * The limit for transient usage of wired pages should be
-	 * larger than for "permanent" wired pages (mlock()).
-	 *
-	 * Also, the sysctl code, which is the only present user
-	 * of vslock(), does a hard loop on EAGAIN.
-	 */
-	if (npages + vm_wire_count() > vm_page_max_wired)
-		return (EAGAIN);
-#endif
 	error = vm_map_wire(&curproc->p_vmspace->vm_map, start, end,
 	    VM_MAP_WIRE_SYSTEM | VM_MAP_WIRE_NOHOLES);
 	if (error == KERN_SUCCESS) {

Modified: head/sys/vm/vm_map.c
==============================================================================
--- head/sys/vm/vm_map.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_map.c	Mon May 13 16:38:48 2019	(r347532)
@@ -90,6 +90,7 @@ __FBSDID("$FreeBSD$");
 #include <vm/pmap.h>
 #include <vm/vm_map.h>
 #include <vm/vm_page.h>
+#include <vm/vm_pageout.h>
 #include <vm/vm_object.h>
 #include <vm/vm_pager.h>
 #include <vm/vm_kern.h>
@@ -2917,12 +2918,12 @@ done:
 
 		if (rv == KERN_SUCCESS && (!user_unwire ||
 		    (entry->eflags & MAP_ENTRY_USER_WIRED))) {
-			if (user_unwire)
-				entry->eflags &= ~MAP_ENTRY_USER_WIRED;
 			if (entry->wired_count == 1)
 				vm_map_entry_unwire(map, entry);
 			else
 				entry->wired_count--;
+			if (user_unwire)
+				entry->eflags &= ~MAP_ENTRY_USER_WIRED;
 		}
 		KASSERT((entry->eflags & MAP_ENTRY_IN_TRANSITION) != 0,
 		    ("vm_map_unwire: in-transition flag missing %p", entry));
@@ -2942,6 +2943,28 @@ done:
 	return (rv);
 }
 
+static void
+vm_map_wire_user_count_sub(u_long npages)
+{
+
+	atomic_subtract_long(&vm_user_wire_count, npages);
+}
+
+static bool
+vm_map_wire_user_count_add(u_long npages)
+{
+	u_long wired;
+
+	wired = vm_user_wire_count;
+	do {
+		if (npages + wired > vm_page_max_user_wired)
+			return (false);
+	} while (!atomic_fcmpset_long(&vm_user_wire_count, &wired,
+	    npages + wired));
+
+	return (true);
+}
+
 /*
  *	vm_map_wire_entry_failure:
  *
@@ -2978,37 +3001,49 @@ vm_map_wire_entry_failure(vm_map_t map, vm_map_entry_t
 	entry->wired_count = -1;
 }
 
+int
+vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end, int flags)
+{
+	int rv;
+
+	vm_map_lock(map);
+	rv = vm_map_wire_locked(map, start, end, flags);
+	vm_map_unlock(map);
+	return (rv);
+}
+
+
 /*
- *	vm_map_wire:
+ *	vm_map_wire_locked:
  *
- *	Implements both kernel and user wiring.
+ *	Implements both kernel and user wiring.  Returns with the map locked,
+ *	the map lock may be dropped.
  */
 int
-vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end,
-    int flags)
+vm_map_wire_locked(vm_map_t map, vm_offset_t start, vm_offset_t end, int flags)
 {
 	vm_map_entry_t entry, first_entry, tmp_entry;
 	vm_offset_t faddr, saved_end, saved_start;
-	unsigned int last_timestamp;
+	u_long npages;
+	u_int last_timestamp;
 	int rv;
 	boolean_t need_wakeup, result, user_wire;
 	vm_prot_t prot;
 
+	VM_MAP_ASSERT_LOCKED(map);
+
 	if (start == end)
 		return (KERN_SUCCESS);
 	prot = 0;
 	if (flags & VM_MAP_WIRE_WRITE)
 		prot |= VM_PROT_WRITE;
 	user_wire = (flags & VM_MAP_WIRE_USER) ? TRUE : FALSE;
-	vm_map_lock(map);
 	VM_MAP_RANGE_CHECK(map, start, end);
 	if (!vm_map_lookup_entry(map, start, &first_entry)) {
 		if (flags & VM_MAP_WIRE_HOLESOK)
 			first_entry = first_entry->next;
-		else {
-			vm_map_unlock(map);
+		else
 			return (KERN_INVALID_ADDRESS);
-		}
 	}
 	last_timestamp = map->timestamp;
 	entry = first_entry;
@@ -3042,7 +3077,6 @@ vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset
 							/*
 							 * first_entry has been deleted.
 							 */
-							vm_map_unlock(map);
 							return (KERN_INVALID_ADDRESS);
 						}
 						end = saved_start;
@@ -3082,13 +3116,22 @@ vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset
 		}
 		if (entry->wired_count == 0) {
 			entry->wired_count++;
-			saved_start = entry->start;
-			saved_end = entry->end;
 
+			npages = atop(entry->end - entry->start);
+			if (user_wire && !vm_map_wire_user_count_add(npages)) {
+				vm_map_wire_entry_failure(map, entry,
+				    entry->start);
+				end = entry->end;
+				rv = KERN_RESOURCE_SHORTAGE;
+				goto done;
+			}
+
 			/*
 			 * Release the map lock, relying on the in-transition
 			 * mark.  Mark the map busy for fork.
 			 */
+			saved_start = entry->start;
+			saved_end = entry->end;
 			vm_map_busy(map);
 			vm_map_unlock(map);
 
@@ -3136,6 +3179,8 @@ vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset
 			last_timestamp = map->timestamp;
 			if (rv != KERN_SUCCESS) {
 				vm_map_wire_entry_failure(map, entry, faddr);
+				if (user_wire)
+					vm_map_wire_user_count_sub(npages);
 				end = entry->end;
 				goto done;
 			}
@@ -3201,9 +3246,12 @@ done:
 			 * Undo the wiring.  Wiring succeeded on this entry
 			 * but failed on a later entry.  
 			 */
-			if (entry->wired_count == 1)
+			if (entry->wired_count == 1) {
 				vm_map_entry_unwire(map, entry);
-			else
+				if (user_wire)
+					vm_map_wire_user_count_sub(
+					    atop(entry->end - entry->start));
+			} else
 				entry->wired_count--;
 		}
 	next_entry_done:
@@ -3220,7 +3268,6 @@ done:
 		}
 		vm_map_simplify_entry(map, entry);
 	}
-	vm_map_unlock(map);
 	if (need_wakeup)
 		vm_map_wakeup(map);
 	return (rv);
@@ -3338,13 +3385,18 @@ vm_map_sync(
 static void
 vm_map_entry_unwire(vm_map_t map, vm_map_entry_t entry)
 {
+	vm_size_t size;
 
 	VM_MAP_ASSERT_LOCKED(map);
 	KASSERT(entry->wired_count > 0,
 	    ("vm_map_entry_unwire: entry %p isn't wired", entry));
+
+	size = entry->end - entry->start;
+	if ((entry->eflags & MAP_ENTRY_USER_WIRED) != 0)
+		vm_map_wire_user_count_sub(atop(size));
 	pmap_unwire(map->pmap, entry->start, entry->end);
-	vm_object_unwire(entry->object.vm_object, entry->offset, entry->end -
-	    entry->start, PQ_ACTIVE);
+	vm_object_unwire(entry->object.vm_object, entry->offset, size,
+	    PQ_ACTIVE);
 	entry->wired_count = 0;
 }
 
@@ -4311,12 +4363,11 @@ retry:
 	 * Heed the MAP_WIREFUTURE flag if it was set for this process.
 	 */
 	if (rv == KERN_SUCCESS && (map->flags & MAP_WIREFUTURE) != 0) {
-		vm_map_unlock(map);
-		vm_map_wire(map, grow_start, grow_start + grow_amount,
+		rv = vm_map_wire_locked(map, grow_start,
+		    grow_start + grow_amount,
 		    VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
-		vm_map_lock_read(map);
-	} else
-		vm_map_lock_downgrade(map);
+	}
+	vm_map_lock_downgrade(map);
 
 out:
 #ifdef RACCT

Modified: head/sys/vm/vm_map.h
==============================================================================
--- head/sys/vm/vm_map.h	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_map.h	Mon May 13 16:38:48 2019	(r347532)
@@ -422,7 +422,8 @@ int vm_map_madvise (vm_map_t, vm_offset_t, vm_offset_t
 int vm_map_stack (vm_map_t, vm_offset_t, vm_size_t, vm_prot_t, vm_prot_t, int);
 int vm_map_unwire(vm_map_t map, vm_offset_t start, vm_offset_t end,
     int flags);
-int vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end,
+int vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end, int flags);
+int vm_map_wire_locked(vm_map_t map, vm_offset_t start, vm_offset_t end,
     int flags);
 long vmspace_swap_count(struct vmspace *vmspace);
 void vm_map_entry_set_vnode_text(vm_map_entry_t entry, bool add);

Modified: head/sys/vm/vm_meter.c
==============================================================================
--- head/sys/vm/vm_meter.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_meter.c	Mon May 13 16:38:48 2019	(r347532)
@@ -97,6 +97,8 @@ struct vmmeter __read_mostly vm_cnt = {
 	.v_wire_count = EARLY_COUNTER,
 };
 
+u_long __exclusive_cache_line vm_user_wire_count;
+
 static void
 vmcounter_startup(void)
 {
@@ -394,6 +396,8 @@ sysctl_handle_vmstat_proc(SYSCTL_HANDLER_ARGS)
 
 #define	VM_STATS_UINT(var, descr)	\
     SYSCTL_UINT(_vm_stats_vm, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, 0, descr)
+#define	VM_STATS_ULONG(var, descr)	\
+    SYSCTL_ULONG(_vm_stats_vm, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, 0, descr)
 
 VM_STATS_UINT(v_page_size, "Page size in bytes");
 VM_STATS_UINT(v_page_count, "Total number of pages in system");
@@ -410,6 +414,9 @@ VM_STATS_PROC(v_laundry_count, "Pages eligible for lau
 VM_STATS_UINT(v_pageout_free_min, "Min pages reserved for kernel");
 VM_STATS_UINT(v_interrupt_free_min, "Reserved pages for interrupt code");
 VM_STATS_UINT(v_free_severe, "Severe page depletion point");
+
+SYSCTL_ULONG(_vm_stats_vm, OID_AUTO, v_user_wire_count, CTLFLAG_RD,
+    &vm_user_wire_count, 0, "User-wired virtual memory");
 
 #ifdef COMPAT_FREEBSD11
 /*

Modified: head/sys/vm/vm_mmap.c
==============================================================================
--- head/sys/vm/vm_mmap.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_mmap.c	Mon May 13 16:38:48 2019	(r347532)
@@ -1003,7 +1003,7 @@ kern_mlock(struct proc *proc, struct ucred *cred, uint
 	if (last < addr || end < addr)
 		return (EINVAL);
 	npages = atop(end - start);
-	if (npages > vm_page_max_wired)
+	if (npages > vm_page_max_user_wired)
 		return (ENOMEM);
 	map = &proc->p_vmspace->vm_map;
 	PROC_LOCK(proc);
@@ -1013,8 +1013,6 @@ kern_mlock(struct proc *proc, struct ucred *cred, uint
 		return (ENOMEM);
 	}
 	PROC_UNLOCK(proc);
-	if (npages + vm_wire_count() > vm_page_max_wired)
-		return (EAGAIN);
 #ifdef RACCT
 	if (racct_enable) {
 		PROC_LOCK(proc);
@@ -1091,7 +1089,12 @@ sys_mlockall(struct thread *td, struct mlockall_args *
 		 */
 		error = vm_map_wire(map, vm_map_min(map), vm_map_max(map),
 		    VM_MAP_WIRE_USER|VM_MAP_WIRE_HOLESOK);
-		error = (error == KERN_SUCCESS ? 0 : EAGAIN);
+		if (error == KERN_SUCCESS)
+			error = 0;
+		else if (error == KERN_RESOURCE_SHORTAGE)
+			error = ENOMEM;
+		else
+			error = EAGAIN;
 	}
 #ifdef RACCT
 	if (racct_enable && error != KERN_SUCCESS) {
@@ -1558,10 +1561,16 @@ vm_mmap_object(vm_map_t map, vm_offset_t *addr, vm_siz
 		 * If the process has requested that all future mappings
 		 * be wired, then heed this.
 		 */
-		if (map->flags & MAP_WIREFUTURE) {
-			vm_map_wire(map, *addr, *addr + size,
-			    VM_MAP_WIRE_USER | ((flags & MAP_STACK) ?
-			    VM_MAP_WIRE_HOLESOK : VM_MAP_WIRE_NOHOLES));
+		if ((map->flags & MAP_WIREFUTURE) != 0) {
+			vm_map_lock(map);
+			if ((map->flags & MAP_WIREFUTURE) != 0)
+				rv = vm_map_wire_locked(map, *addr,
+				    *addr + size, VM_MAP_WIRE_USER |
+				    ((flags & MAP_STACK) ? VM_MAP_WIRE_HOLESOK :
+				    VM_MAP_WIRE_NOHOLES));
+			if (rv != KERN_SUCCESS)
+				(void)vm_map_delete(map, *addr, *addr + size);
+			vm_map_unlock(map);
 		}
 	}
 	return (vm_mmap_to_errno(rv));

Modified: head/sys/vm/vm_pageout.c
==============================================================================
--- head/sys/vm/vm_pageout.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_pageout.c	Mon May 13 16:38:48 2019	(r347532)
@@ -194,9 +194,10 @@ SYSCTL_UINT(_vm, OID_AUTO, background_launder_max, CTL
 
 int vm_pageout_page_count = 32;
 
-int vm_page_max_wired;		/* XXX max # of wired pages system-wide */
-SYSCTL_INT(_vm, OID_AUTO, max_wired,
-	CTLFLAG_RW, &vm_page_max_wired, 0, "System-wide limit to wired page count");
+u_long vm_page_max_user_wired;
+SYSCTL_ULONG(_vm, OID_AUTO, max_user_wired, CTLFLAG_RW,
+    &vm_page_max_user_wired, 0,
+    "system-wide limit to user-wired page count");
 
 static u_int isqrt(u_int num);
 static int vm_pageout_launder(struct vm_domain *vmd, int launder,
@@ -2041,8 +2042,8 @@ vm_pageout_init(void)
 	if (vm_pageout_update_period == 0)
 		vm_pageout_update_period = 600;
 
-	if (vm_page_max_wired == 0)
-		vm_page_max_wired = freecount / 3;
+	if (vm_page_max_user_wired == 0)
+		vm_page_max_user_wired = freecount / 3;
 }
 
 /*

Modified: head/sys/vm/vm_pageout.h
==============================================================================
--- head/sys/vm/vm_pageout.h	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_pageout.h	Mon May 13 16:38:48 2019	(r347532)
@@ -75,7 +75,7 @@
  *	Exported data structures.
  */
 
-extern int vm_page_max_wired;
+extern u_long vm_page_max_user_wired;
 extern int vm_pageout_page_count;
 
 #define	VM_OOM_MEM	1

Modified: head/sys/vm/vm_unix.c
==============================================================================
--- head/sys/vm/vm_unix.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/sys/vm/vm_unix.c	Mon May 13 16:38:48 2019	(r347532)
@@ -95,13 +95,11 @@ kern_break(struct thread *td, uintptr_t *addr)
 	rlim_t datalim, lmemlim, vmemlim;
 	int prot, rv;
 	int error = 0;
-	boolean_t do_map_wirefuture;
 
 	datalim = lim_cur(td, RLIMIT_DATA);
 	lmemlim = lim_cur(td, RLIMIT_MEMLOCK);
 	vmemlim = lim_cur(td, RLIMIT_VMEM);
 
-	do_map_wirefuture = FALSE;
 	new = round_page(*addr);
 	vm_map_lock(map);
 
@@ -184,7 +182,14 @@ kern_break(struct thread *td, uintptr_t *addr)
 		if (i386_read_exec && SV_PROC_FLAG(td->td_proc, SV_ILP32))
 			prot |= VM_PROT_EXECUTE;
 #endif
-		rv = vm_map_insert(map, NULL, 0, old, new, prot, VM_PROT_ALL, 0);
+		rv = vm_map_insert(map, NULL, 0, old, new, prot, VM_PROT_ALL,
+		    0);
+		if (rv == KERN_SUCCESS && (map->flags & MAP_WIREFUTURE) != 0) {
+			rv = vm_map_wire_locked(map, old, new,
+			    VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
+			if (rv != KERN_SUCCESS)
+				vm_map_delete(map, old, new);
+		}
 		if (rv != KERN_SUCCESS) {
 #ifdef RACCT
 			if (racct_enable) {
@@ -205,17 +210,6 @@ kern_break(struct thread *td, uintptr_t *addr)
 			goto done;
 		}
 		vm->vm_dsize += btoc(new - old);
-		/*
-		 * Handle the MAP_WIREFUTURE case for legacy applications,
-		 * by marking the newly mapped range of pages as wired.
-		 * We are not required to perform a corresponding
-		 * vm_map_unwire() before vm_map_delete() below, as
-		 * it will forcibly unwire the pages in the range.
-		 *
-		 * XXX If the pages cannot be wired, no error is returned.
-		 */
-		if ((map->flags & MAP_WIREFUTURE) == MAP_WIREFUTURE)
-			do_map_wirefuture = TRUE;
 	} else if (new < old) {
 		rv = vm_map_delete(map, new, old);
 		if (rv != KERN_SUCCESS) {
@@ -238,10 +232,6 @@ kern_break(struct thread *td, uintptr_t *addr)
 	}
 done:
 	vm_map_unlock(map);
-
-	if (do_map_wirefuture)
-		(void) vm_map_wire(map, old, new,
-		    VM_MAP_WIRE_USER|VM_MAP_WIRE_NOHOLES);
 
 	if (error == 0)
 		*addr = new;

Modified: head/usr.bin/vmstat/vmstat.c
==============================================================================
--- head/usr.bin/vmstat/vmstat.c	Mon May 13 15:39:54 2019	(r347531)
+++ head/usr.bin/vmstat/vmstat.c	Mon May 13 16:38:48 2019	(r347532)
@@ -156,6 +156,7 @@ static struct __vmmeter {
 	u_int v_free_min;
 	u_int v_free_count;
 	u_int v_wire_count;
+	u_long v_user_wire_count;
 	u_int v_active_count;
 	u_int v_inactive_target;
 	u_int v_inactive_count;
@@ -566,6 +567,7 @@ fill_vmmeter(struct __vmmeter *vmmp)
 		GET_VM_STATS(vm, v_free_min);
 		GET_VM_STATS(vm, v_free_count);
 		GET_VM_STATS(vm, v_wire_count);
+		GET_VM_STATS(vm, v_user_wire_count);
 		GET_VM_STATS(vm, v_active_count);
 		GET_VM_STATS(vm, v_inactive_target);
 		GET_VM_STATS(vm, v_inactive_count);
@@ -1057,6 +1059,8 @@ dosum(void)
 	    sum.v_laundry_count);
 	xo_emit("{:wired-pages/%9u} {N:pages wired down}\n",
 	    sum.v_wire_count);
+	xo_emit("{:virtual-user-wired-pages/%9lu} {N:virtual user pages wired "
+	    "down}\n", sum.v_user_wire_count);
 	xo_emit("{:free-pages/%9u} {N:pages free}\n",
 	    sum.v_free_count);
 	xo_emit("{:bytes-per-page/%9u} {N:bytes per page}\n", sum.v_page_size);

