git: 6d3c78d97028 - main - Rewrite the vm_page_alloc manual page

From: Mark Johnston <markj_at_FreeBSD.org>
Date: Wed, 20 Oct 2021 01:23:22 UTC
The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=6d3c78d970283dc5e64eaf10a4ae40b54613d608

commit 6d3c78d970283dc5e64eaf10a4ae40b54613d608
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-10-20 00:26:30 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-10-20 01:22:56 +0000

    Rewrite the vm_page_alloc manual page
    
    Document the new allocator variants and flesh out the description of
    some details of the page allocator interface.
    
    Reviewed by:    kib, alc
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D32035
---
 share/man/man9/Makefile        |  11 ++
 share/man/man9/vm_page_alloc.9 | 339 ++++++++++++++++++++++++++++++++++-------
 2 files changed, 292 insertions(+), 58 deletions(-)

diff --git a/share/man/man9/Makefile b/share/man/man9/Makefile
index fdaa14bc93e4..eb4465259226 100644
--- a/share/man/man9/Makefile
+++ b/share/man/man9/Makefile
@@ -2330,6 +2330,17 @@ MLINKS+=vm_map_stack.9 vm_map_growstack.9
 MLINKS+=vm_map_wire.9 vm_map_wire_mapped.9 \
 	vm_page_wire.9 vm_page_unwire.9 \
 	vm_page_wire.9 vm_page_unwire_noq.9
+MLINKS+=vm_page_alloc.9 vm_page_alloc_after.9 \
+	vm_page_alloc.9 vm_page_alloc_contig.9 \
+	vm_page_alloc.9 vm_page_alloc_contig_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_domain_after.9 \
+	vm_page_alloc.9 vm_page_alloc_freelist.9 \
+	vm_page_alloc.9 vm_page_alloc_freelist_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj_contig.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj_contig_domain.9 \
+	vm_page_alloc.9 vm_page_alloc_noobj_domain.9
 MLINKS+=vm_page_bits.9 vm_page_clear_dirty.9 \
 	vm_page_bits.9 vm_page_dirty.9 \
 	vm_page_bits.9 vm_page_is_valid.9 \
diff --git a/share/man/man9/vm_page_alloc.9 b/share/man/man9/vm_page_alloc.9
index aa3854b47aea..1b587339b0cd 100644
--- a/share/man/man9/vm_page_alloc.9
+++ b/share/man/man9/vm_page_alloc.9
@@ -1,5 +1,9 @@
 .\"
 .\" Copyright (C) 2001 Chad David <davidc@acns.ab.ca>. All rights reserved.
+.\" Copyright (c) 2021 The FreeBSD Foundation
+.\"
+.\" Portions of this documentation were written by Mark Johnston under
+.\" sponsorship from the FreeBSD Foundation.
 .\"
 .\" Redistribution and use in source and binary forms, with or without
 .\" modification, are permitted provided that the following conditions
@@ -26,102 +30,321 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd November 16, 2016
+.Dd October 17, 2021
 .Dt VM_PAGE_ALLOC 9
 .Os
 .Sh NAME
 .Nm vm_page_alloc
-.Nd "allocate a page for a"
-.Vt vm_object
+.Nd "allocate a page of memory"
 .Sh SYNOPSIS
 .In sys/param.h
 .In vm/vm.h
 .In vm/vm_page.h
 .Ft vm_page_t
 .Fn vm_page_alloc "vm_object_t object" "vm_pindex_t pindex" "int req"
+.Ft vm_page_t
+.Fo vm_page_alloc_after
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int req"
+.Fa "vm_page_t mpred"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_contig
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_contig_domain
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_domain
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int domain"
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_domain_after
+.Fa "vm_object_t object"
+.Fa "vm_pindex_t pindex"
+.Fa "int domain"
+.Fa "int req"
+.Fa "vm_page_t mpred"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_freelist
+.Fa "int freelist"
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_freelist_domain
+.Fa "int domain"
+.Fa "int freelist"
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj
+.Fa "int req"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj_contig
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj_contig_domain
+.Fa "int domain"
+.Fa "int req"
+.Fa "u_long npages"
+.Fa "vm_paddr_t low"
+.Fa "vm_paddr_t high"
+.Fa "u_long alignment"
+.Fa "vm_paddr_t boundary"
+.Fa "vm_memattr_t memattr"
+.Fc
+.Ft vm_page_t
+.Fo vm_page_alloc_noobj_domain
+.Fa "int domain"
+.Fa "int req"
+.Fc
 .Sh DESCRIPTION
 The
 .Fn vm_page_alloc
-function allocates a page at
+family of functions allocate one or more pages of physical memory.
+Most kernel code should not call these functions directly but should instead
+use a kernel memory allocator such as
+.Xr malloc 9
+or
+.Xr uma 9 ,
+or should use a higher-level interface to the page cache, such as
+.Xr vm_page_grab 9 .
+.Pp
+All of the functions take a
+.Fa req
+parameter which encodes the allocation priority and optional modifier flags,
+described below.
+The functions whose names do not include
+.Dq noobj
+additionally insert the pages starting at index
 .Fa pindex
-within
+in the
+VM object
 .Fa object .
-It is assumed that a page has not already been allocated at
-.Fa pindex .
-The page returned is inserted into the object, unless
-.Dv VM_ALLOC_NOOBJ
-is specified in the
-.Fa req .
+The object must be write-locked and not have a page already resident at the
+specified index.
+The functions whose names include
+.Dq domain
+support NUMA-aware allocation by returning pages from the
+.Xr numa 4
+domain specified by
+.Fa domain .
 .Pp
+The
+.Fn vm_page_alloc_after
+and
+.Fn vm_page_alloc_domain_after
+functions behave identically to
 .Fn vm_page_alloc
-will not sleep.
+and
+.Fn vm_page_alloc_domain ,
+respectively, except that they take an additional parameter
+.Fa mpred
+which must be the page resident in
+.Fa object
+with largest index smaller than
+.Fa pindex ,
+or
+.Dv NULL
+if no such page exists.
+These functions exist to optimize the common case of loops that allocate
+multiple pages at successive indices within an object.
 .Pp
-Its arguments are:
-.Bl -tag -width ".Fa object"
-.It Fa object
-The VM object to allocate the page for.
 The
-.Fa object
-must be locked if
-.Dv VM_ALLOC_NOOBJ
-is not specified.
-.It Fa pindex
-The index into the object at which the page should be inserted.
-.It Fa req
-The bitwise-inclusive OR of a class and any optional flags indicating
-how the page should be allocated.
+.Fn vm_page_alloc_contig
+and
+.Fn vm_page_alloc_noobj_contig
+functions and their NUMA-aware variants allocate a physically contiguous run of
+.Fa npages
+pages which satisfies the specified constraints.
+The
+.Fa low
+and
+.Fa high
+parameters specify a physical address range from which the run is to
+be allocated.
+The
+.Fa alignment
+parameter specifies the requested alignment of the first page in the run
+and must be a power of two.
+If the
+.Fa boundary
+parameter is non-zero, the pages constituting the run will not cross a
+physical address that is a multiple of the parameter value, which must be a
+power of two.
+If
+.Fa memattr
+is not equal to
+.Dv VM_MEMATTR_DEFAULT ,
+then mappings of the returned pages created by, e.g.,
+.Xr pmap_enter 9
+or
+.Xr pmap_qenter 9 ,
+will carry the machine-dependent encoding of the memory attribute.
+Additionally, the direct mapping of the page, if any, will be updated to
+reflect the requested memory attribute.
+.Pp
+The
+.Fn vm_page_alloc_freelist
+and
+.Fn vm_page_alloc_freelist_domain
+functions behave identically to
+.Fn vm_page_alloc_noobj
+and
+.Fn vm_page_alloc_noobj_domain ,
+respectively, except that a successful allocation will return a page from the
+specified physical memory freelist.
+These functions are not intended for use outside of the virtual memory
+subsystem and exist only to support the requirements of certain platforms.
+.Sh REQUEST FLAGS
+All page allocator functions accept a
+.Fa req
+parameter that governs certain aspects of the function's behavior.
+.Pp
+The
+.Dv VM_ALLOC_WAITOK ,
+.Dv VM_ALLOC_WAITFAIL ,
+and
+.Dv VM_ALLOC_NOWAIT
+flags specify the behavior of the allocator if free pages could not be
+immediately allocated.
+The
+.Dv VM_ALLOC_WAITOK
+flag can only be used with the
+.Dq noobj
+variants.
+If
+.Dv VM_ALLOC_NOWAIT
+is specified, then the allocator gives up and returns
+.Dv NULL .
+.Dv VM_ALLOC_NOWAIT
+is specified implicitly if none of the flags are present in the request.
+If either
+.Dv VM_ALLOC_WAITOK
+or
+.Dv VM_ALLOC_WAITFAIL
+is specified, the allocator will put the calling thread to sleep until
+sufficient free pages become available.
+At this point, if
+.Dv VM_ALLOC_WAITFAIL
+is specified the allocator will return
+.Dv NULL ,
+and if
+.Dv VM_ALLOC_WAITOK
+is specified the allocator will retry the allocation.
+After a failed
+.Dv VM_ALLOC_WAITFAIL
+allocation returns, the VM object, if any, will have been unlocked while the
+thread was sleeping.
+In this case the VM object write lock will be re-acquired before the function
+call returns.
 .Pp
-Exactly one of the following classes must be specified:
+.Fa req
+also encodes the allocation request priority.
+By default the page(s) are allocated with no special treatment.
+If the number of available free pages is below a certain watermark, the
+allocation will fail or the allocating thread will sleep, depending on
+the specified wait flag.
+The watermark is computed at boot time and corresponds to a small (less than
+one percent) fraction of the system's total physical memory.
+To allocate memory more aggressively, one of following flags may be specified.
 .Bl -tag -width ".Dv VM_ALLOC_INTERRUPT"
-.It Dv VM_ALLOC_NORMAL
-The page should be allocated with no special treatment.
 .It Dv VM_ALLOC_SYSTEM
-The page can be allocated if the cache is empty and the free
-page count is above the interrupt reserved water mark.
+The page can be allocated if the free page count is above the interrupt
+reserved water mark.
 This flag should be used only when the system really needs the page.
 .It Dv VM_ALLOC_INTERRUPT
-.Fn vm_page_alloc
-is being called during an interrupt.
-A page will be returned successfully if the free page count is greater
-than zero.
+The allocation will fail only if zero free pages are available.
+This flag should be used only if the consequences of an allocation failure
+are worse than leaving the system without free memory.
+For example, this flag is used when allocating kernel page table pages, where
+allocation failures trigger a kernel panic.
 .El
 .Pp
-The optional flags are:
+The following optional flags can further modify allocator behavior:
 .Bl -tag -width ".Dv VM_ALLOC_NOBUSY"
+.It Dv VM_ALLOC_SBUSY
+The returned page will be shared-busy.
+This flag may only be specified when allocating pages in a VM object.
 .It Dv VM_ALLOC_NOBUSY
-The returned page will not be exclusive busy.
+The returned page will not be busy.
+This flag is implicit when allocating pages without a VM object.
+When allocating pages in a VM object, and neither
+.Dv VM_ALLOC_SBUSY
+nor
+.Dv VM_ALLOC_NOBUSY
+are specified, the returned pages will be exclusively busied.
 .It Dv VM_ALLOC_NODUMP
 The returned page will not be included in any kernel core dumps
 regardless of whether or not it is mapped in to KVA.
-.It Dv VM_ALLOC_NOOBJ
-Do not associate the allocated page with a vm object.
-The
-.Fa object
-argument is ignored.
-.It Dv VM_ALLOC_SBUSY
-The returned page will be shared busy.
 .It Dv VM_ALLOC_WIRED
 The returned page will be wired.
 .It Dv VM_ALLOC_ZERO
-Indicate a preference for a pre-zeroed page.
-There is no guarantee that the returned page will be zeroed, but it
-will have the
-.Dv PG_ZERO
-flag set if it is zeroed.
-.El
+If this flag is specified, the
+.Dq noobj
+variants will return zeroed pages.
+The other allocator interfaces ignore this flag.
+.It Dv VM_ALLOC_COUNT(n)
+Hint that at least
+.Fa n
+pages will be allocated by the caller in the near future.
+.Fa n
+must be no larger than 65535.
+If the system is short of free pages, this hint may cause the kernel
+to reclaim memory more aggressively than it would otherwise.
 .El
 .Sh RETURN VALUES
-The
-.Vt vm_page_t
-that was allocated is returned if successful; otherwise,
+If the allocation was successful, a pointer to the
+.Vt struct vm_page
+corresponding to the allocated page is returned.
+If the allocation request specified multiple pages, the returned
+pointer points to an array of
+.Vt struct vm_page
+constituting the run.
+Upon failure,
 .Dv NULL
 is returned.
-.Sh NOTES
-The pager process is always upgraded to
-.Dv VM_ALLOC_SYSTEM
-unless
-.Dv VM_ALLOC_INTERRUPT
-is set.
+Regardless of whether the allocation succeeds or fails, the VM
+object
+.Fa object
+will be write-locked upon return.
+.Sh SEE ALSO
+.Xr numa 4 ,
+.Xr malloc 9 ,
+.Xr uma 9 ,
+.Xr vm_page_grab 9 ,
+.Xr vm_page_sbusy 9
 .Sh AUTHORS
 This manual page was written by
 .An Chad David Aq Mt davidc@acns.ab.ca .