svn commit: r347532 - in head: contrib/netbsd-tests/lib/libc/sys lib/libc/sys lib/libc/tests/sys sys/amd64/vmm sys/sys sys/vm usr.bin/vmstat

Rodney W. Grimes freebsd at gndrsh.dnsmgr.net
Mon May 13 17:35:13 UTC 2019


> Author: markj
> Date: Mon May 13 16:38:48 2019
> New Revision: 347532
> URL: https://svnweb.freebsd.org/changeset/base/347532
> 
> Log:
>   Provide separate accounting for user-wired pages.
>   
>   Historically we have not distinguished between kernel wirings and user
>   wirings for accounting purposes.  User wirings (via mlock(2)) were
>   subject to a global limit on the number of wired pages, so if large
>   swaths of physical memory were wired by the kernel, as happens with
>   the ZFS ARC among other things, the limit could be exceeded, causing
>   user wirings to fail.
>   
>   The change adds a new counter, v_user_wire_count, which counts the
>   number of virtual pages wired by user processes via mlock(2) and
>   mlockall(2).  Only user-wired pages are subject to the system-wide
>   limit which helps provide some safety against deadlocks.  In
>   particular, while sources of kernel wirings typically support some
>   backpressure mechanism, there is no way to reclaim user-wired pages
>   short of killing the wiring process.  The limit is exported as
>   vm.max_user_wired, renamed from vm.max_wired, and changed from u_int
>   to u_long.
>   
>   The choice to count virtual user-wired pages rather than physical
>   pages was done for simplicity.  There are mechanisms that can cause
>   user-wired mappings to be destroyed while maintaining a wiring of
>   the backing physical page; these make it difficult to accurately
>   track user wirings at the physical page layer.
>   
>   The change also closes some holes which allowed user wirings to succeed
>   even when they would cause the system limit to be exceeded.  For
>   instance, mmap() may now fail with ENOMEM in a process that has called
>   mlockall(MCL_FUTURE) if the new mapping would cause the user wiring
>   limit to be exceeded.
>   
>   Note that bhyve -S is subject to the user wiring limit, which defaults
>   to 1/3 of physical RAM.  Users that wish to exceed the limit must tune
>   vm.max_user_wired.

Because of that, this should probably have a:
Release Notes:	Yes
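For anyone adapting callers: the log above means mlock(2) can now fail with ENOMEM where the old system-wide check returned EAGAIN. A minimal, portable sketch of probing this from userland (the size and the helper name are mine, not from the commit):

```c
/* Sketch: the new failure mode seen from userland.  After r347532,
 * exceeding either the per-process or the system-wide user wiring
 * limit makes mlock(2) fail with ENOMEM (the system-wide case used
 * to be EAGAIN).  The 1 MiB size below is arbitrary. */
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Try to wire len bytes; returns 0 on success, the errno otherwise. */
static int
try_wire(size_t len)
{
	void *p = malloc(len);
	int rv = 0;

	if (p == NULL)
		return (ENOMEM);
	if (mlock(p, len) == 0)
		munlock(p, len);	/* undo the wiring */
	else
		rv = errno;	/* ENOMEM: per-process or vm.max_user_wired */
	free(p);
	return (rv);
}
```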

>   Reviewed by:	kib, ngie (mlock() test changes)
>   Tested by:	pho (earlier version)
>   MFC after:	45 days
>   Sponsored by:	Netflix
>   Differential Revision:	https://reviews.freebsd.org/D19908
> 
> Modified:
>   head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c
>   head/lib/libc/sys/mlock.2
>   head/lib/libc/sys/mlockall.2
>   head/lib/libc/tests/sys/mlock_helper.c
>   head/sys/amd64/vmm/vmm.c
>   head/sys/sys/vmmeter.h
>   head/sys/vm/vm_glue.c
>   head/sys/vm/vm_map.c
>   head/sys/vm/vm_map.h
>   head/sys/vm/vm_meter.c
>   head/sys/vm/vm_mmap.c
>   head/sys/vm/vm_pageout.c
>   head/sys/vm/vm_pageout.h
>   head/sys/vm/vm_unix.c
>   head/usr.bin/vmstat/vmstat.c
> 
> Modified: head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c
> ==============================================================================
> --- head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/contrib/netbsd-tests/lib/libc/sys/t_mlock.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -51,7 +51,7 @@ __RCSID("$NetBSD: t_mlock.c,v 1.6 2016/08/09 12:02:44 
>  #define _KMEMUSER
>  #include <machine/vmparam.h>
>  
> -void set_vm_max_wired(int);
> +void set_vm_max_wired(u_long);
>  void restore_vm_max_wired(void);
>  #endif
>  
> 
> Modified: head/lib/libc/sys/mlock.2
> ==============================================================================
> --- head/lib/libc/sys/mlock.2	Mon May 13 15:39:54 2019	(r347531)
> +++ head/lib/libc/sys/mlock.2	Mon May 13 16:38:48 2019	(r347532)
> @@ -28,7 +28,7 @@
>  .\"	@(#)mlock.2	8.2 (Berkeley) 12/11/93
>  .\" $FreeBSD$
>  .\"
> -.Dd March 20, 2018
> +.Dd May 13, 2019
>  .Dt MLOCK 2
>  .Os
>  .Sh NAME
> @@ -97,13 +97,13 @@ resource limit and the
>  system-wide
>  .Dq wired pages
>  limit
> -.Va vm.max_wired .
> -.Va vm.max_wired
> +.Va vm.max_user_wired .
> +.Va vm.max_user_wired
>  applies to the system as a whole, so the amount available to a single
>  process at any given time is the difference between
> -.Va vm.max_wired
> +.Va vm.max_user_wired
>  and
> -.Va vm.stats.vm.v_wire_count .
> +.Va vm.stats.vm.v_user_wire_count .
>  .Pp
>  If
>  .Va security.bsd.unprivileged_mlock
> @@ -124,13 +124,11 @@ will fail if:
>  is set to 0 and the caller is not the super-user.
>  .It Bq Er EINVAL
>  The address range given wraps around zero.
> -.It Bq Er EAGAIN
> -Locking the indicated range would exceed the system limit for locked memory.
>  .It Bq Er ENOMEM
>  Some portion of the indicated address range is not allocated.
>  There was an error faulting/mapping a page.
> -Locking the indicated range would exceed the per-process limit for locked
> -memory.
> +Locking the indicated range would exceed the per-process or system-wide limits
> +for locked memory.
>  .El
>  The
>  .Fn munlock
> @@ -171,11 +169,11 @@ system calls first appeared in
>  Allocating too much wired memory can lead to a memory-allocation deadlock
>  which requires a reboot to recover from.
>  .Pp
> -The per-process resource limit is a limit on the amount of virtual
> -memory locked, while the system-wide limit is for the number of locked
> -physical pages.
> -Hence a process with two distinct locked mappings of the same physical page
> -counts as 2 pages against the per-process limit and as only a single page
> -in the system limit.
> +The per-process and system-wide resource limits of locked memory apply
> +to the amount of virtual memory locked, not the amount of locked physical
> +pages.
> +Hence two distinct locked mappings of the same physical page count as
> +2 pages against the system limit, and also against the per-process limit
> +if both mappings belong to the same physical map.
>  .Pp
>  The per-process resource limit is not currently supported.
> 
> Modified: head/lib/libc/sys/mlockall.2
> ==============================================================================
> --- head/lib/libc/sys/mlockall.2	Mon May 13 15:39:54 2019	(r347531)
> +++ head/lib/libc/sys/mlockall.2	Mon May 13 16:38:48 2019	(r347532)
> @@ -30,7 +30,7 @@
>  .\"
>  .\" $FreeBSD$
>  .\"
> -.Dd December 25, 2012
> +.Dd May 13, 2019
>  .Dt MLOCKALL 2
>  .Os
>  .Sh NAME
> @@ -69,7 +69,7 @@ limited in how much they can lock down.
>  A single process can lock the minimum of a system-wide
>  .Dq wired pages
>  limit
> -.Va vm.max_wired
> +.Va vm.max_user_wired
>  and the per-process
>  .Dv RLIMIT_MEMLOCK
>  resource limit.
> @@ -138,9 +138,9 @@ and
>  functions first appeared in
>  .Fx 5.1 .
>  .Sh BUGS
> -The per-process resource limit is a limit on the amount of virtual
> -memory locked, while the system-wide limit is for the number of locked
> -physical pages.
> -Hence a process with two distinct locked mappings of the same physical page
> -counts as 2 pages against the per-process limit and as only a single page
> -in the system limit.
> +The per-process and system-wide resource limits of locked memory apply
> +to the amount of virtual memory locked, not the amount of locked physical
> +pages.
> +Hence two distinct locked mappings of the same physical page count as
> +2 pages against the system limit, and also against the per-process limit
> +if both mappings belong to the same physical map.
> 
> Modified: head/lib/libc/tests/sys/mlock_helper.c
> ==============================================================================
> --- head/lib/libc/tests/sys/mlock_helper.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/lib/libc/tests/sys/mlock_helper.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -39,16 +39,16 @@ __FBSDID("$FreeBSD$");
>  #include <limits.h>
>  #include <stdio.h>
>  
> -#define	VM_MAX_WIRED "vm.max_wired"
> +#define	VM_MAX_WIRED "vm.max_user_wired"
>  
>  static void
> -vm_max_wired_sysctl(int *old_value, int *new_value)
> +vm_max_wired_sysctl(u_long *old_value, u_long *new_value)
>  {
>  	size_t old_len;
> -	size_t new_len = (new_value == NULL ? 0 : sizeof(int));
> +	size_t new_len = (new_value == NULL ? 0 : sizeof(*new_value));
>  
>  	if (old_value == NULL)
> -		printf("Setting the new value to %d\n", *new_value);
> +		printf("Setting the new value to %lu\n", *new_value);
>  	else {
>  		ATF_REQUIRE_MSG(sysctlbyname(VM_MAX_WIRED, NULL, &old_len,
>  		    new_value, new_len) == 0,
> @@ -60,14 +60,14 @@ vm_max_wired_sysctl(int *old_value, int *new_value)
>  	    "sysctlbyname(%s) failed: %s", VM_MAX_WIRED, strerror(errno));
>  
>  	if (old_value != NULL)
> -		printf("Saved the old value (%d)\n", *old_value);
> +		printf("Saved the old value (%lu)\n", *old_value);
>  }
>  
>  void
> -set_vm_max_wired(int new_value)
> +set_vm_max_wired(u_long new_value)
>  {
>  	FILE *fp;
> -	int old_value;
> +	u_long old_value;
>  
>  	fp = fopen(VM_MAX_WIRED, "w");
>  	if (fp == NULL) {
> @@ -78,7 +78,7 @@ set_vm_max_wired(int new_value)
>  
>  	vm_max_wired_sysctl(&old_value, NULL);
>  
> -	ATF_REQUIRE_MSG(fprintf(fp, "%d", old_value) > 0,
> +	ATF_REQUIRE_MSG(fprintf(fp, "%lu", old_value) > 0,
>  	    "saving %s failed", VM_MAX_WIRED);
>  
>  	fclose(fp);
> @@ -90,7 +90,7 @@ void
>  restore_vm_max_wired(void)
>  {
>  	FILE *fp;
> -	int saved_max_wired;
> +	u_long saved_max_wired;
>  
>  	fp = fopen(VM_MAX_WIRED, "r");
>  	if (fp == NULL) {
> @@ -98,14 +98,14 @@ restore_vm_max_wired(void)
>  		return;
>  	}
>  
> -	if (fscanf(fp, "%d", &saved_max_wired) != 1) {
> +	if (fscanf(fp, "%lu", &saved_max_wired) != 1) {
>  		perror("fscanf failed\n");
>  		fclose(fp);
>  		return;
>  	}
>  
>  	fclose(fp);
> -	printf("old value in %s: %d\n", VM_MAX_WIRED, saved_max_wired);
> +	printf("old value in %s: %lu\n", VM_MAX_WIRED, saved_max_wired);
>  
>  	if (saved_max_wired == 0) /* This will cripple the test host */
>  		return;
> 
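The helper changes above are mostly the int -> u_long widening; the part that is easy to get wrong is the matching %d -> %lu change in the round-trip through the scratch file. A standalone sketch of that pattern (file path and function names are mine, not from the tree):

```c
/* Sketch of the helper's save/restore pattern after the type change:
 * the saved limit is now a u_long, so the file round-trip must use
 * %lu throughout, or values above INT_MAX get mangled. */
#include <stdio.h>

/* Write the old limit to a scratch file so a later run can restore it. */
static int
save_limit(const char *path, unsigned long value)
{
	FILE *fp = fopen(path, "w");
	int ok;

	if (fp == NULL)
		return (-1);
	ok = fprintf(fp, "%lu", value) > 0;	/* %lu, not %d */
	fclose(fp);
	return (ok ? 0 : -1);
}

/* Read it back; returns 0 and fills *value on success. */
static int
load_limit(const char *path, unsigned long *value)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (fp == NULL)
		return (-1);
	ok = fscanf(fp, "%lu", value) == 1;
	fclose(fp);
	return (ok ? 0 : -1);
}
```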
> Modified: head/sys/amd64/vmm/vmm.c
> ==============================================================================
> --- head/sys/amd64/vmm/vmm.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/amd64/vmm/vmm.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -754,7 +754,8 @@ vm_mmap_memseg(struct vm *vm, vm_paddr_t gpa, int segi
>  		    VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
>  		if (error != KERN_SUCCESS) {
>  			vm_map_remove(&vm->vmspace->vm_map, gpa, gpa + len);
> -			return (EFAULT);
> +			return (error == KERN_RESOURCE_SHORTAGE ? ENOMEM :
> +			    EFAULT);
>  		}
>  	}
>  
> 
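The vmm.c hunk above maps the VM layer's resource-shortage status to ENOMEM, so bhyve -S hitting vm.max_user_wired is distinguishable from a plain fault. A sketch of that translation with stand-in constants (the kernel's real KERN_* definitions live in the VM headers; the value 6 here is only illustrative):

```c
/* Sketch: translating VM-layer status codes to errno values, as the
 * vm_mmap_memseg() change above now does.  KERN_* constants are
 * stand-ins with illustrative values, not the kernel's headers. */
#include <errno.h>

#define	KERN_SUCCESS		0
#define	KERN_RESOURCE_SHORTAGE	6	/* illustrative value */

static int
wire_error_to_errno(int kern_rv)
{
	if (kern_rv == KERN_SUCCESS)
		return (0);
	/* Hitting the user wiring limit is a resource shortage -> ENOMEM;
	 * any other wiring failure still surfaces as EFAULT. */
	return (kern_rv == KERN_RESOURCE_SHORTAGE ? ENOMEM : EFAULT);
}
```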
> Modified: head/sys/sys/vmmeter.h
> ==============================================================================
> --- head/sys/sys/vmmeter.h	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/sys/vmmeter.h	Mon May 13 16:38:48 2019	(r347532)
> @@ -153,6 +153,8 @@ extern domainset_t vm_severe_domains;
>  #define	VM_CNT_INC(var)		VM_CNT_ADD(var, 1)
>  #define	VM_CNT_FETCH(var)	counter_u64_fetch(vm_cnt.var)
>  
> +extern u_long vm_user_wire_count;
> +
>  static inline void
>  vm_wire_add(int cnt)
>  {
> 
> Modified: head/sys/vm/vm_glue.c
> ==============================================================================
> --- head/sys/vm/vm_glue.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_glue.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -181,21 +181,8 @@ vslock(void *addr, size_t len)
>  	if (last < (vm_offset_t)addr || end < (vm_offset_t)addr)
>  		return (EINVAL);
>  	npages = atop(end - start);
> -	if (npages > vm_page_max_wired)
> +	if (npages > vm_page_max_user_wired)
>  		return (ENOMEM);
> -#if 0
> -	/*
> -	 * XXX - not yet
> -	 *
> -	 * The limit for transient usage of wired pages should be
> -	 * larger than for "permanent" wired pages (mlock()).
> -	 *
> -	 * Also, the sysctl code, which is the only present user
> -	 * of vslock(), does a hard loop on EAGAIN.
> -	 */
> -	if (npages + vm_wire_count() > vm_page_max_wired)
> -		return (EAGAIN);
> -#endif
>  	error = vm_map_wire(&curproc->p_vmspace->vm_map, start, end,
>  	    VM_MAP_WIRE_SYSTEM | VM_MAP_WIRE_NOHOLES);
>  	if (error == KERN_SUCCESS) {
> 
> Modified: head/sys/vm/vm_map.c
> ==============================================================================
> --- head/sys/vm/vm_map.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_map.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -90,6 +90,7 @@ __FBSDID("$FreeBSD$");
>  #include <vm/pmap.h>
>  #include <vm/vm_map.h>
>  #include <vm/vm_page.h>
> +#include <vm/vm_pageout.h>
>  #include <vm/vm_object.h>
>  #include <vm/vm_pager.h>
>  #include <vm/vm_kern.h>
> @@ -2917,12 +2918,12 @@ done:
>  
>  		if (rv == KERN_SUCCESS && (!user_unwire ||
>  		    (entry->eflags & MAP_ENTRY_USER_WIRED))) {
> -			if (user_unwire)
> -				entry->eflags &= ~MAP_ENTRY_USER_WIRED;
>  			if (entry->wired_count == 1)
>  				vm_map_entry_unwire(map, entry);
>  			else
>  				entry->wired_count--;
> +			if (user_unwire)
> +				entry->eflags &= ~MAP_ENTRY_USER_WIRED;
>  		}
>  		KASSERT((entry->eflags & MAP_ENTRY_IN_TRANSITION) != 0,
>  		    ("vm_map_unwire: in-transition flag missing %p", entry));
> @@ -2942,6 +2943,28 @@ done:
>  	return (rv);
>  }
>  
> +static void
> +vm_map_wire_user_count_sub(u_long npages)
> +{
> +
> +	atomic_subtract_long(&vm_user_wire_count, npages);
> +}
> +
> +static bool
> +vm_map_wire_user_count_add(u_long npages)
> +{
> +	u_long wired;
> +
> +	wired = vm_user_wire_count;
> +	do {
> +		if (npages + wired > vm_page_max_user_wired)
> +			return (false);
> +	} while (!atomic_fcmpset_long(&vm_user_wire_count, &wired,
> +	    npages + wired));
> +
> +	return (true);
> +}
> +
>  /*
>   *	vm_map_wire_entry_failure:
>   *
> @@ -2978,37 +3001,49 @@ vm_map_wire_entry_failure(vm_map_t map, vm_map_entry_t
>  	entry->wired_count = -1;
>  }
>  
> +int
> +vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end, int flags)
> +{
> +	int rv;
> +
> +	vm_map_lock(map);
> +	rv = vm_map_wire_locked(map, start, end, flags);
> +	vm_map_unlock(map);
> +	return (rv);
> +}
> +
> +
>  /*
> - *	vm_map_wire:
> + *	vm_map_wire_locked:
>   *
> - *	Implements both kernel and user wiring.
> + *	Implements both kernel and user wiring.  Returns with the map locked,
> + *	the map lock may be dropped.
>   */
>  int
> -vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end,
> -    int flags)
> +vm_map_wire_locked(vm_map_t map, vm_offset_t start, vm_offset_t end, int flags)
>  {
>  	vm_map_entry_t entry, first_entry, tmp_entry;
>  	vm_offset_t faddr, saved_end, saved_start;
> -	unsigned int last_timestamp;
> +	u_long npages;
> +	u_int last_timestamp;
>  	int rv;
>  	boolean_t need_wakeup, result, user_wire;
>  	vm_prot_t prot;
>  
> +	VM_MAP_ASSERT_LOCKED(map);
> +
>  	if (start == end)
>  		return (KERN_SUCCESS);
>  	prot = 0;
>  	if (flags & VM_MAP_WIRE_WRITE)
>  		prot |= VM_PROT_WRITE;
>  	user_wire = (flags & VM_MAP_WIRE_USER) ? TRUE : FALSE;
> -	vm_map_lock(map);
>  	VM_MAP_RANGE_CHECK(map, start, end);
>  	if (!vm_map_lookup_entry(map, start, &first_entry)) {
>  		if (flags & VM_MAP_WIRE_HOLESOK)
>  			first_entry = first_entry->next;
> -		else {
> -			vm_map_unlock(map);
> +		else
>  			return (KERN_INVALID_ADDRESS);
> -		}
>  	}
>  	last_timestamp = map->timestamp;
>  	entry = first_entry;
> @@ -3042,7 +3077,6 @@ vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset
>  							/*
>  							 * first_entry has been deleted.
>  							 */
> -							vm_map_unlock(map);
>  							return (KERN_INVALID_ADDRESS);
>  						}
>  						end = saved_start;
> @@ -3082,13 +3116,22 @@ vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset
>  		}
>  		if (entry->wired_count == 0) {
>  			entry->wired_count++;
> -			saved_start = entry->start;
> -			saved_end = entry->end;
>  
> +			npages = atop(entry->end - entry->start);
> +			if (user_wire && !vm_map_wire_user_count_add(npages)) {
> +				vm_map_wire_entry_failure(map, entry,
> +				    entry->start);
> +				end = entry->end;
> +				rv = KERN_RESOURCE_SHORTAGE;
> +				goto done;
> +			}
> +
>  			/*
>  			 * Release the map lock, relying on the in-transition
>  			 * mark.  Mark the map busy for fork.
>  			 */
> +			saved_start = entry->start;
> +			saved_end = entry->end;
>  			vm_map_busy(map);
>  			vm_map_unlock(map);
>  
> @@ -3136,6 +3179,8 @@ vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset
>  			last_timestamp = map->timestamp;
>  			if (rv != KERN_SUCCESS) {
>  				vm_map_wire_entry_failure(map, entry, faddr);
> +				if (user_wire)
> +					vm_map_wire_user_count_sub(npages);
>  				end = entry->end;
>  				goto done;
>  			}
> @@ -3201,9 +3246,12 @@ done:
>  			 * Undo the wiring.  Wiring succeeded on this entry
>  			 * but failed on a later entry.  
>  			 */
> -			if (entry->wired_count == 1)
> +			if (entry->wired_count == 1) {
>  				vm_map_entry_unwire(map, entry);
> -			else
> +				if (user_wire)
> +					vm_map_wire_user_count_sub(
> +					    atop(entry->end - entry->start));
> +			} else
>  				entry->wired_count--;
>  		}
>  	next_entry_done:
> @@ -3220,7 +3268,6 @@ done:
>  		}
>  		vm_map_simplify_entry(map, entry);
>  	}
> -	vm_map_unlock(map);
>  	if (need_wakeup)
>  		vm_map_wakeup(map);
>  	return (rv);
> @@ -3338,13 +3385,18 @@ vm_map_sync(
>  static void
>  vm_map_entry_unwire(vm_map_t map, vm_map_entry_t entry)
>  {
> +	vm_size_t size;
>  
>  	VM_MAP_ASSERT_LOCKED(map);
>  	KASSERT(entry->wired_count > 0,
>  	    ("vm_map_entry_unwire: entry %p isn't wired", entry));
> +
> +	size = entry->end - entry->start;
> +	if ((entry->eflags & MAP_ENTRY_USER_WIRED) != 0)
> +		vm_map_wire_user_count_sub(atop(size));
>  	pmap_unwire(map->pmap, entry->start, entry->end);
> -	vm_object_unwire(entry->object.vm_object, entry->offset, entry->end -
> -	    entry->start, PQ_ACTIVE);
> +	vm_object_unwire(entry->object.vm_object, entry->offset, size,
> +	    PQ_ACTIVE);
>  	entry->wired_count = 0;
>  }
>  
> @@ -4311,12 +4363,11 @@ retry:
>  	 * Heed the MAP_WIREFUTURE flag if it was set for this process.
>  	 */
>  	if (rv == KERN_SUCCESS && (map->flags & MAP_WIREFUTURE) != 0) {
> -		vm_map_unlock(map);
> -		vm_map_wire(map, grow_start, grow_start + grow_amount,
> +		rv = vm_map_wire_locked(map, grow_start,
> +		    grow_start + grow_amount,
>  		    VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
> -		vm_map_lock_read(map);
> -	} else
> -		vm_map_lock_downgrade(map);
> +	}
> +	vm_map_lock_downgrade(map);
>  
>  out:
>  #ifdef RACCT
> 
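vm_map_wire_user_count_add() above is a lock-free check-then-add against the limit: the compare-and-swap retries reload the observed count, so the bound is never overshot. The same pattern in portable C11 atomics, runnable outside the kernel (the 1024-page cap is an arbitrary stand-in for vm.max_user_wired):

```c
/* Sketch of the bounded lock-free counter vm_map_wire_user_count_add()
 * implements, using C11 <stdatomic.h>.  The kernel's
 * atomic_fcmpset_long() reloads the observed value on failure;
 * atomic_compare_exchange_weak() does the same here. */
#include <stdatomic.h>
#include <stdbool.h>

static atomic_ulong user_wire_count;
static unsigned long max_user_wired = 1024;	/* stand-in limit */

/* Reserve npages, or fail with no side effects if the cap would be hit. */
static bool
user_wire_count_add(unsigned long npages)
{
	unsigned long wired = atomic_load(&user_wire_count);

	do {
		if (npages + wired > max_user_wired)
			return (false);	/* -> KERN_RESOURCE_SHORTAGE */
	} while (!atomic_compare_exchange_weak(&user_wire_count, &wired,
	    wired + npages));
	return (true);
}

static void
user_wire_count_sub(unsigned long npages)
{
	atomic_fetch_sub(&user_wire_count, npages);
}
```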
> Modified: head/sys/vm/vm_map.h
> ==============================================================================
> --- head/sys/vm/vm_map.h	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_map.h	Mon May 13 16:38:48 2019	(r347532)
> @@ -422,7 +422,8 @@ int vm_map_madvise (vm_map_t, vm_offset_t, vm_offset_t
>  int vm_map_stack (vm_map_t, vm_offset_t, vm_size_t, vm_prot_t, vm_prot_t, int);
>  int vm_map_unwire(vm_map_t map, vm_offset_t start, vm_offset_t end,
>      int flags);
> -int vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end,
> +int vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end, int flags);
> +int vm_map_wire_locked(vm_map_t map, vm_offset_t start, vm_offset_t end,
>      int flags);
>  long vmspace_swap_count(struct vmspace *vmspace);
>  void vm_map_entry_set_vnode_text(vm_map_entry_t entry, bool add);
> 
> Modified: head/sys/vm/vm_meter.c
> ==============================================================================
> --- head/sys/vm/vm_meter.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_meter.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -97,6 +97,8 @@ struct vmmeter __read_mostly vm_cnt = {
>  	.v_wire_count = EARLY_COUNTER,
>  };
>  
> +u_long __exclusive_cache_line vm_user_wire_count;
> +
>  static void
>  vmcounter_startup(void)
>  {
> @@ -394,6 +396,8 @@ sysctl_handle_vmstat_proc(SYSCTL_HANDLER_ARGS)
>  
>  #define	VM_STATS_UINT(var, descr)	\
>      SYSCTL_UINT(_vm_stats_vm, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, 0, descr)
> +#define	VM_STATS_ULONG(var, descr)	\
> +    SYSCTL_ULONG(_vm_stats_vm, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, 0, descr)
>  
>  VM_STATS_UINT(v_page_size, "Page size in bytes");
>  VM_STATS_UINT(v_page_count, "Total number of pages in system");
> @@ -410,6 +414,9 @@ VM_STATS_PROC(v_laundry_count, "Pages eligible for lau
>  VM_STATS_UINT(v_pageout_free_min, "Min pages reserved for kernel");
>  VM_STATS_UINT(v_interrupt_free_min, "Reserved pages for interrupt code");
>  VM_STATS_UINT(v_free_severe, "Severe page depletion point");
> +
> +SYSCTL_ULONG(_vm_stats_vm, OID_AUTO, v_user_wire_count, CTLFLAG_RD,
> +    &vm_user_wire_count, 0, "User-wired virtual memory");
>  
>  #ifdef COMPAT_FREEBSD11
>  /*
> 
> Modified: head/sys/vm/vm_mmap.c
> ==============================================================================
> --- head/sys/vm/vm_mmap.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_mmap.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -1003,7 +1003,7 @@ kern_mlock(struct proc *proc, struct ucred *cred, uint
>  	if (last < addr || end < addr)
>  		return (EINVAL);
>  	npages = atop(end - start);
> -	if (npages > vm_page_max_wired)
> +	if (npages > vm_page_max_user_wired)
>  		return (ENOMEM);
>  	map = &proc->p_vmspace->vm_map;
>  	PROC_LOCK(proc);
> @@ -1013,8 +1013,6 @@ kern_mlock(struct proc *proc, struct ucred *cred, uint
>  		return (ENOMEM);
>  	}
>  	PROC_UNLOCK(proc);
> -	if (npages + vm_wire_count() > vm_page_max_wired)
> -		return (EAGAIN);
>  #ifdef RACCT
>  	if (racct_enable) {
>  		PROC_LOCK(proc);
> @@ -1091,7 +1089,12 @@ sys_mlockall(struct thread *td, struct mlockall_args *
>  		 */
>  		error = vm_map_wire(map, vm_map_min(map), vm_map_max(map),
>  		    VM_MAP_WIRE_USER|VM_MAP_WIRE_HOLESOK);
> -		error = (error == KERN_SUCCESS ? 0 : EAGAIN);
> +		if (error == KERN_SUCCESS)
> +			error = 0;
> +		else if (error == KERN_RESOURCE_SHORTAGE)
> +			error = ENOMEM;
> +		else
> +			error = EAGAIN;
>  	}
>  #ifdef RACCT
>  	if (racct_enable && error != KERN_SUCCESS) {
> @@ -1558,10 +1561,16 @@ vm_mmap_object(vm_map_t map, vm_offset_t *addr, vm_siz
>  		 * If the process has requested that all future mappings
>  		 * be wired, then heed this.
>  		 */
> -		if (map->flags & MAP_WIREFUTURE) {
> -			vm_map_wire(map, *addr, *addr + size,
> -			    VM_MAP_WIRE_USER | ((flags & MAP_STACK) ?
> -			    VM_MAP_WIRE_HOLESOK : VM_MAP_WIRE_NOHOLES));
> +		if ((map->flags & MAP_WIREFUTURE) != 0) {
> +			vm_map_lock(map);
> +			if ((map->flags & MAP_WIREFUTURE) != 0)
> +				rv = vm_map_wire_locked(map, *addr,
> +				    *addr + size, VM_MAP_WIRE_USER |
> +				    ((flags & MAP_STACK) ? VM_MAP_WIRE_HOLESOK :
> +				    VM_MAP_WIRE_NOHOLES));
> +			if (rv != KERN_SUCCESS)
> +				(void)vm_map_delete(map, *addr, *addr + size);
> +			vm_map_unlock(map);
>  		}
>  	}
>  	return (vm_mmap_to_errno(rv));
> 
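From userland, the sys_mlockall() change above means callers should treat ENOMEM as a hard limit (vm.max_user_wired or RLIMIT_MEMLOCK) and EAGAIN as potentially retryable. A minimal portable sketch of that distinction (the wrapper is mine, not from the commit):

```c
/* Sketch: interpreting mlockall(2) failures after the change above.
 * ENOMEM now means a wiring limit would be exceeded; EAGAIN remains
 * for other, possibly transient, wiring failures. */
#include <errno.h>
#include <sys/mman.h>

/* Returns 0 on success, otherwise the errno; the caller can tell a
 * hard limit (ENOMEM) from a retryable failure (EAGAIN). */
static int
lock_all_current(void)
{
	if (mlockall(MCL_CURRENT) == 0) {
		munlockall();	/* undo; this was only a probe */
		return (0);
	}
	return (errno);
}
```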
> Modified: head/sys/vm/vm_pageout.c
> ==============================================================================
> --- head/sys/vm/vm_pageout.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_pageout.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -194,9 +194,10 @@ SYSCTL_UINT(_vm, OID_AUTO, background_launder_max, CTL
>  
>  int vm_pageout_page_count = 32;
>  
> -int vm_page_max_wired;		/* XXX max # of wired pages system-wide */
> -SYSCTL_INT(_vm, OID_AUTO, max_wired,
> -	CTLFLAG_RW, &vm_page_max_wired, 0, "System-wide limit to wired page count");
> +u_long vm_page_max_user_wired;
> +SYSCTL_ULONG(_vm, OID_AUTO, max_user_wired, CTLFLAG_RW,
> +    &vm_page_max_user_wired, 0,
> +    "system-wide limit to user-wired page count");
>  
>  static u_int isqrt(u_int num);
>  static int vm_pageout_launder(struct vm_domain *vmd, int launder,
> @@ -2041,8 +2042,8 @@ vm_pageout_init(void)
>  	if (vm_pageout_update_period == 0)
>  		vm_pageout_update_period = 600;
>  
> -	if (vm_page_max_wired == 0)
> -		vm_page_max_wired = freecount / 3;
> +	if (vm_page_max_user_wired == 0)
> +		vm_page_max_user_wired = freecount / 3;
>  }
>  
>  /*
> 
> Modified: head/sys/vm/vm_pageout.h
> ==============================================================================
> --- head/sys/vm/vm_pageout.h	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_pageout.h	Mon May 13 16:38:48 2019	(r347532)
> @@ -75,7 +75,7 @@
>   *	Exported data structures.
>   */
>  
> -extern int vm_page_max_wired;
> +extern u_long vm_page_max_user_wired;
>  extern int vm_pageout_page_count;
>  
>  #define	VM_OOM_MEM	1
> 
> Modified: head/sys/vm/vm_unix.c
> ==============================================================================
> --- head/sys/vm/vm_unix.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/sys/vm/vm_unix.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -95,13 +95,11 @@ kern_break(struct thread *td, uintptr_t *addr)
>  	rlim_t datalim, lmemlim, vmemlim;
>  	int prot, rv;
>  	int error = 0;
> -	boolean_t do_map_wirefuture;
>  
>  	datalim = lim_cur(td, RLIMIT_DATA);
>  	lmemlim = lim_cur(td, RLIMIT_MEMLOCK);
>  	vmemlim = lim_cur(td, RLIMIT_VMEM);
>  
> -	do_map_wirefuture = FALSE;
>  	new = round_page(*addr);
>  	vm_map_lock(map);
>  
> @@ -184,7 +182,14 @@ kern_break(struct thread *td, uintptr_t *addr)
>  		if (i386_read_exec && SV_PROC_FLAG(td->td_proc, SV_ILP32))
>  			prot |= VM_PROT_EXECUTE;
>  #endif
> -		rv = vm_map_insert(map, NULL, 0, old, new, prot, VM_PROT_ALL, 0);
> +		rv = vm_map_insert(map, NULL, 0, old, new, prot, VM_PROT_ALL,
> +		    0);
> +		if (rv == KERN_SUCCESS && (map->flags & MAP_WIREFUTURE) != 0) {
> +			rv = vm_map_wire_locked(map, old, new,
> +			    VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
> +			if (rv != KERN_SUCCESS)
> +				vm_map_delete(map, old, new);
> +		}
>  		if (rv != KERN_SUCCESS) {
>  #ifdef RACCT
>  			if (racct_enable) {
> @@ -205,17 +210,6 @@ kern_break(struct thread *td, uintptr_t *addr)
>  			goto done;
>  		}
>  		vm->vm_dsize += btoc(new - old);
> -		/*
> -		 * Handle the MAP_WIREFUTURE case for legacy applications,
> -		 * by marking the newly mapped range of pages as wired.
> -		 * We are not required to perform a corresponding
> -		 * vm_map_unwire() before vm_map_delete() below, as
> -		 * it will forcibly unwire the pages in the range.
> -		 *
> -		 * XXX If the pages cannot be wired, no error is returned.
> -		 */
> -		if ((map->flags & MAP_WIREFUTURE) == MAP_WIREFUTURE)
> -			do_map_wirefuture = TRUE;
>  	} else if (new < old) {
>  		rv = vm_map_delete(map, new, old);
>  		if (rv != KERN_SUCCESS) {
> @@ -238,10 +232,6 @@ kern_break(struct thread *td, uintptr_t *addr)
>  	}
>  done:
>  	vm_map_unlock(map);
> -
> -	if (do_map_wirefuture)
> -		(void) vm_map_wire(map, old, new,
> -		    VM_MAP_WIRE_USER|VM_MAP_WIRE_NOHOLES);
>  
>  	if (error == 0)
>  		*addr = new;
> 
> Modified: head/usr.bin/vmstat/vmstat.c
> ==============================================================================
> --- head/usr.bin/vmstat/vmstat.c	Mon May 13 15:39:54 2019	(r347531)
> +++ head/usr.bin/vmstat/vmstat.c	Mon May 13 16:38:48 2019	(r347532)
> @@ -156,6 +156,7 @@ static struct __vmmeter {
>  	u_int v_free_min;
>  	u_int v_free_count;
>  	u_int v_wire_count;
> +	u_long v_user_wire_count;
>  	u_int v_active_count;
>  	u_int v_inactive_target;
>  	u_int v_inactive_count;
> @@ -566,6 +567,7 @@ fill_vmmeter(struct __vmmeter *vmmp)
>  		GET_VM_STATS(vm, v_free_min);
>  		GET_VM_STATS(vm, v_free_count);
>  		GET_VM_STATS(vm, v_wire_count);
> +		GET_VM_STATS(vm, v_user_wire_count);
>  		GET_VM_STATS(vm, v_active_count);
>  		GET_VM_STATS(vm, v_inactive_target);
>  		GET_VM_STATS(vm, v_inactive_count);
> @@ -1057,6 +1059,8 @@ dosum(void)
>  	    sum.v_laundry_count);
>  	xo_emit("{:wired-pages/%9u} {N:pages wired down}\n",
>  	    sum.v_wire_count);
> +	xo_emit("{:virtual-user-wired-pages/%9lu} {N:virtual user pages wired "
> +	    "down}\n", sum.v_user_wire_count);
>  	xo_emit("{:free-pages/%9u} {N:pages free}\n",
>  	    sum.v_free_count);
>  	xo_emit("{:bytes-per-page/%9u} {N:bytes per page}\n", sum.v_page_size);
> 
> 

-- 
Rod Grimes                                                 rgrimes at freebsd.org

