PERFORCE change 34992 for review

Fri Jul 25 16:48:46 GMT 2003

http://perforce.freebsd.org/chv.cgi?CH=34992

Change 34992 by rwatson at rwatson_paprika on 2003/07/25 09:47:47

	First 45 pages of the secarch document; has been following
	me around on my notebook for a while, and it would be a good
	idea to get it in P4 so that when I drop my notebook, it's
	not lost.  The kernel section is doing quite well (VFS and
	networking need work); userspace needs more fleshing out
	generally, especially relating to PAM, NSS, and crypto
	services.

Affected files ...

.. //depot/projects/trustedbsd/doc/en_US.ISO8859-1/books/developers-handbook/secarch/chapter.sgml#2 edit

Differences ...

==== //depot/projects/trustedbsd/doc/en_US.ISO8859-1/books/developers-handbook/secarch/chapter.sgml#2 (text+ko) ====

@@ -1,0 +1,2808 @@
+<!--
+    Copyright (c) 2002, 2003 Networks Associates Technology, Inc.
+    All rights reserved.
+
+    This software was developed for the FreeBSD Project by Network
+    Associates Laboratories, the Security Research Division of Network
+    Associates, Inc. under DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"),
+    as part of the DARPA CHATS research program.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+    1. Redistributions of source code must retain the above copyright
+       notice, this list of conditions and the following disclaimer.
+    2. Redistributions in binary form must reproduce the above copyright
+       notice, this list of conditions and the following disclaimer in the
+       documentation and/or other materials provided with the distribution.
+    
+    THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
+    ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+    ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
+    FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+    DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+    OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+    HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+    OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+    SUCH DAMAGE.
+    
+    $FreeBSD$
+-->
+
+<chapter id="secarch">
+  <chapterinfo>
+    <authorgroup>
+      <author>
+        <firstname>Robert</firstname>
+        <surname>Watson</surname>
+        
+        <affiliation>
+          <orgname>TrustedBSD Project, Network Associates 
+	    Laboratories</orgname>          
+	  <address><email>rwatson at FreeBSD.org</email></address>
+	</affiliation>
+      </author>
+    </authorgroup>
+  </chapterinfo>
+  
+  <title>FreeBSD Security Architecture</title>
+
+  <sect1 id="secarch-copyright">
+    <title>FreeBSD Security Architecture Copyright</title>
+
+    <para>This software was developed for the FreeBSD Project by Network
+      Associates Laboratories, the Security Research Division of Network
+      Associates, Inc. under DARPA/SPAWAR contract N66001-01-C-8035
+      ("CBOSS"), as part of the DARPA CHATS research program.</para>
+
+    <para>Redistribution and use in source (SGML DocBook) and
+      'compiled' forms (SGML, HTML, PDF, PostScript, RTF and so forth)
+      with or without modification, are permitted provided that the
+      following conditions are met:</para>
+
+    <orderedlist>
+      <listitem>
+        <para>Redistributions of source code (SGML DocBook) must
+          retain the above copyright notice, this list of conditions
+          and the following disclaimer as the first lines of this file
+          unmodified.</para>
+      </listitem>
+
+      <listitem>
+        <para>Redistributions in compiled form (transformed to other
+          DTDs, converted to PDF, PostScript, RTF and other formats)
+          must reproduce the above copyright notice, this list of
+          conditions and the following disclaimer in the documentation
+          and/or other materials provided with the distribution.</para>
+      </listitem>
+    </orderedlist>
+
+    <important>
+      <para>THIS DOCUMENTATION IS PROVIDED BY THE NETWORKS ASSOCIATES
+        TECHNOLOGY, INC "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+        INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+        MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+        DISCLAIMED. IN NO EVENT SHALL NETWORKS ASSOCIATES TECHNOLOGY,
+        INC BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+        EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+        LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+        OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+        STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+        ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN
+        IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.</para>
+    </important>
+  </sect1>
+  
+  <sect1 id="secarch-synopsis">
+    <title>Synopsis</title>
+
+    <para>The FreeBSD operating system contains a variety of security
+      elements intended to support secure and reliable system operation.
+      These elements include:</para>
+
+      <itemizedlist>
+	<listitem><para>Segmented address space to protect kernel
+	  operation from accidental or malicious user process
+	  interference.</para></listitem>
+	<listitem><para>Inter-process memory protections to limit the
+	  impact of buggy or malicious user applications on other
+	  running applications.</para></listitem>
+	<listitem><para>Association of user credentials with processes,
+	  including user identifiers, group list membership, jail virtual
+	  system, and mandatory access control label, supporting a
+	  multi-user environment.</para></listitem>
+	<listitem><para>Privilege model based on a privileged root user
+	  (uid 0).</para></listitem>
+	<listitem><para>Inter-process controls to prevent improper
+	  interference between processes belonging to different users,
+	  as well as to protect processes that undergo privilege level
+	  changes.</para></listitem>
+	<listitem><para>Discretionary file system protections based on
+	  user/group ownership, file permission mask, and file flags.
+	  Optionally, extended discretionary access control list support
+	  on the UFS and UFS2 file systems.
+	  Special file modes to support uid and gid transition on
+	  execution.</para></listitem>
+	<listitem><para>Mapping of network credentials received via
+	  NFS Remote Procedure Calls (RPCs) to local credentials.
+	  Administrative limits, by network address, to NFS services.
+	  </para></listitem>
+	<listitem><para>Discretionary protections on System V IPC
+	  primitives (shared memory, message queues, and semaphores)
+	  based on ownership and permissions.</para></listitem>
+	<listitem><para>Extensible kernel and user access control
+	  through a pluggable MAC Framework, permitting kernel modules
+	  to bind additional security label data to processes and
+	  system objects, and to enforce discretionary or mandatory
+	  policies, including Biba and LOMAC integrity, MLS
+	  confidentiality, and other augmented system security
+	  policies.</para></listitem>
+	<listitem><para>Pluggable Authentication Module (PAM) support
+	  permitting administrators to require appropriate authentication
+	  in a multi-user environment.
+	  Modules support traditional passwords, one-time passwords,
+	  distributed passwords and authentication services such as
+	  KerberosIV and Kerberos5, and support a variety of hardware
+	  authentication tokens.
+	  In addition, modules authorize login access to the system,
+	  provide for accounting, implement password changing, and
+	  enforce password change policies such as password length
+	  requirements.</para></listitem>
+	<listitem><para>Name Service Switch (NSS) support permitting
+	  a variety of local and distributed directory services to
+	  provide account and authentication data.</para></listitem>
+	<listitem><para>A variety of remote access and management
+	  tools with cryptographic protection of network traffic.
+	  </para></listitem>
+      </itemizedlist>
+
+      <note><para>This revision of the FreeBSD Security Architecture
+	describes the architecture as found in FreeBSD 5.1, and may
+	not accurately describe other versions of the FreeBSD
+	operating system.</para></note>
+  </sect1>
+
+  <sect1 id="secarch-approach">
+    <title>Approach</title>
+
+    <para>As a general-purpose, multi-user operating system, FreeBSD
+      includes a number of security elements intended to form the
+      foundation for a secure application environment.
+      This includes basic system integrity, confidentiality, and
+      availability services.
+      This approach is intended to resist attack in a variety of forms,
+      and against a variety of attack methodologies.
+      In the next section, basic security concepts and assumptions are
+      discussed, including the goals of integrity, confidentiality, and
+      availability as addressed by FreeBSD.</para>
+
+    <para>In general, FreeBSD adopts the same stance as most other
+      operating systems based on the UNIX model: the kernel is isolated
+      from user processes, which represent a variety of programs in
+      execution in isolated address spaces.
+      Processes each carry a process credential, managed by the kernel,
+      describing user and group information for the process, which will
+      be used to authorize access to other kernel objects.
+      Based on the credential and various object properties, several
+      mandatory and discretionary protection models control the
+      interactions between processes, and access by the processes to
+      various system resources (including storage, network
+      communications, etc.).</para>
+
+    <para>As installed, FreeBSD supports easy communication and
+      collaboration between users, while providing the primitives to
+      prevent inappropriate release or modification of data owned by
+      users.
+      Some users are granted special system administration privileges by
+      virtue of being members of specific groups (such as the "operator"
+      and "wheel" groups); in addition, a special administrative account,
+      "root", is used to manage the system.
+      However, the security primitives and configuration are frequently
+      adapted to support much stronger or finer-grained security
+      deployment requirements, including containment of mutually
+      untrusting processes.</para>
+
+    <para>FreeBSD also contains a number of extensions permitting
+      greater flexibility and control, including a system
+      partitioning model widely used by ISPs (jail), support
+      for Mandatory Access Control, and extensible access control
+      policies through the MAC Framework.
+      These mechanisms permit administrators to control the flow of
+      information in systems in a variety of ways, including using the
+      MLS mandatory sensitivity policy, and the Biba integrity policy.
+      These capabilities are similar to those found in many commercial
+      trusted operating systems, and permit FreeBSD to be used 
+      in environments less reminiscent of the time-sharing
+      systems from which the UNIX access control requirements are
+      derived.
+      Making use of these primitives permits the administrator to reduce
+      their level of trust in user accounts on the system, limiting the
+      consequences of compromise of individual user accounts or
+      services.</para>
+
+    <para>In recognition of the importance of networks and network
+      infrastructure, FreeBSD provides a variety of remote login
+      services, as well as advanced cryptographic protocols used to
+      protect the integrity of these services.</para>
+  </sect1>
+
+  <sect1 id="secarch-concepts">
+    <title>Security Architecture Concepts</title>
+
+    <para>FreeBSD is a multi-tasking multi-user operating system, serving
+      in a variety of environments with a variety of security requirements.
+      Common deployment environments include single-user or multi-user
+      workstations, large-scale ISP environments, web or file server
+      clusters, and high-end embedded network appliances including
+      network-attached storage, routers, and firewalls.
+      The FreeBSD operating system combines many of the strongest elements
+      of traditional UNIX security, modern cryptographic services, and
+      trusted operating system elements to support the requirements of
+      these environments through flexibility and adaptability.</para>
+
+    <variablelist>
+      <varlistentry>
+	<term>Authorization</term>
+	<listitem>
+	  <para>Authorization refers to the process by which access control
+	    decisions are made--typically, authorization may be performed
+	    on the basis of an authenticated user identity, presentation
+	    of a cryptographic token, an inherited or acquired capability,
+	    explicit access control lists, or a variety of other policy
+	    driven considerations.
+	    Authorization checks occur in both the kernel and userspace
+	    components of FreeBSD.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>Authentication</term>
+	<listitem>
+	  <para>Authentication refers to the process by which a system
+	    (in this case, operating system) determines and confirms
+	    the identity of a user (or another system) it interacts
+	    with.
+	    Frequently in the context of FreeBSD, this refers to early
+	    stages in the login process, in which a user presents a
+	    username and password; however, it may also refer to
+	    inter-host authentication using host SSH keys, IKE key
+	    negotiation for IPsec, and a variety of other elements.
+	    Authentication typically relies on testing the knowledge of
+	    a third party in relation to secrets that only an
+	    appropriate third party could know: for example, by testing
+	    a shared secret (such as a password), by one-time passwords,
+	    or through use of a Public Key Infrastructure.
+	    Typically, system authorization occurs in the context of
+	    an authenticated user identity; however, authorization
+	    decisions may be made prior to user authentication, or in
+	    the case of network activity, without access to any
+	    authenticated identity.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>Availability</term>
+	<listitem>
+	  <para>Availability refers to the design requirement that
+	    services offered by a system be, in as much as is possible,
+	    uninterrupted despite unexpected or undesired circumstances.
+	    In the context of FreeBSD, this concept frequently drives
+	    the requirement for resource accounting, resource limits to
+	    prevent unfair exhaustion of system resources, scheduler
+	    behavior, access controls, and authentication.
+	    Availability is expressed with regard to a subject: frequently,
+	    to maintain availability for one user, it is necessary to
+	    reduce or deny access to services for another user.
+	    Availability is frequently considered in the context of a
+	    malicious user attempting to deny service to other users.
+	    </para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>Integrity</term>
+	<listitem>
+	  <para>Integrity refers to the protection of system operation
+	    and stored data from undesired modification by unauthorized
+	    parties; the integrity of the operating system is required
+	    to ensure proper operation.
+	    The integrity of user data is then protected by the operating
+	    system by means of authentication and authorization.
+	    Integrity guarantees are often important to the notion of
+	    system availability, as interference with system integrity
+	    frequently has an impact on operation.
+	    Cryptographic tools may also be used to measure the integrity
+	    of the system.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>Confidentiality</term>
+	<listitem>
+	  <para>Confidentiality refers to the protection of system and
+	    stored data from undesired leakage to unauthorized
+	    parties: confidentiality of system authentication data
+	    is required to ensure successful authentication, protection
+	    of entropy, etc.
+	    Confidentiality of system and user data is protected by the
+	    operating system by means of authentication and
+	    authorization.
+	    Cryptographic tools may also be used to maintain the secrecy
+	    of data.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>Cryptography</term>
+	<listitem>
+	  <para>Cryptography refers to the use of mathematical
+	    algorithms and techniques to provide guarantees for
+	    the protection of data communications and processes.
+	    Typically, cryptographic techniques are used in three
+	    forms in FreeBSD: the protection of authentication data,
+	    protection of data on storage arrays, and for the purposes
+	    of secure network communications; cryptographic services
+	    are also provided for the benefit of applications.
+	    Guarantees of cryptographic algorithms and protocols often
+	    include integrity, confidentiality, non-repudiation, and
+	    freshness.</para>
+	</listitem>
+      </varlistentry>
+    </variablelist>
+
+  </sect1>
+
+  <sect1 id="secarch-kernel">
+    <title>Kernel Security Model</title>
+
+    <para></para>
+
+    <sect2 id="secarch-kernel-addressspace">
+      <title>Kernel Address Space Separation</title>
+
+      <para>FreeBSD, as is the case with most UNIX-derived operating
+	systems, executes the kernel in a supervisor hardware mode
+	that prevents direct access to kernel memory and
+	hardware resources by user processes, providing integrity and
+	confidentiality protections for kernel data structures and code.
+	On all current hardware architectures, this is accomplished by
+	reserving a segment of the system address space for read/write
+	access only by appropriately authorized task descriptors; access
+        to privileged instructions, such as those used to configure
+        page tables, flush TLBs, and configure new tasks, is limited
+        to code executing in kernel mode in the common case.
+	On the i386 platform, this is implemented using rings -- the
+	kernel operates in ring 0, and user processes operate in ring 3.
+	Processes are forced to communicate with the kernel through a
+	variety of more explicit traps, including exceptions generated by
+	arithmetic traps in the instruction stream, exceptional memory
+	accesses such as page faults, or system calls via call gates.
+	Kernel code interacting with user processes is written carefully
+	so as to support only the desired interactions
+	between the kernel and user processes.</para>
+
+      <para>Within the kernel, direct manipulation of user memory contents
+	is generally avoided, and instead abstracted through a series of
+	copy routines that enforce appropriate protections, preventing
+	(among other things) the kernel from dereferencing user-provided
+	pointers to kernel address space.
+	In general, direct access to system hardware devices is prohibited
+	to user processes--they must make use of controlled kernel service
+	APIs to access disk storage, I/O busses, etc.
+	This rule is circumvented under special circumstances, such as the
+	creation of user process device drivers (most frequently, for the
+	X11 window system).
+	To bypass this protection, privilege is required, or must be
+	delegated.</para>
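+
+      <para>The protective role of the copy routines can be
+        illustrated with a small user-space model.
+        The sketch below is not kernel code: the
+        <literal>USER_LIMIT</literal> boundary, the
+        <literal>Fault</literal> exception, and the dictionary standing
+        in for mapped user pages are all assumptions made for
+        illustration.
+        It shows only the idea that a copy routine refuses any
+        user-supplied range that reaches into kernel address
+        space.</para>

```python
# Toy model of a kernel copy routine's bounds check. USER_LIMIT is a
# made-up boundary for illustration, not a value taken from FreeBSD.
USER_LIMIT = 0xC0000000  # hypothetical first kernel address


class Fault(Exception):
    """Stands in for the EFAULT error a real copy routine would return."""


def copyin(user_memory, uaddr, length):
    """Return `length` bytes starting at user address `uaddr`.

    `user_memory` is a dict mapping addresses to byte values; it plays
    the role of the calling process's mapped pages.
    """
    if uaddr < 0 or length < 0:
        raise Fault("negative address or length")
    # Refuse any range that starts in, or runs into, kernel space.
    if uaddr + length > USER_LIMIT:
        raise Fault("range reaches kernel address space")
    try:
        return bytes(user_memory[uaddr + i] for i in range(length))
    except KeyError:
        raise Fault("unmapped user address") from None


if __name__ == "__main__":
    memory = {0x1000 + i: b for i, b in enumerate(b"hello")}
    print(copyin(memory, 0x1000, 5))
    try:
        copyin(memory, USER_LIMIT - 2, 4)
    except Fault as err:
        print("rejected:", err)
```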
+    </sect2>
+    
+    <sect2 id="secarch-kernel-bypass">
+      <title>Direct Hardware and Kernel Memory Access</title>
+	
+      <para>Some bypass mechanisms are provided to permit privileged user 
+	processes to monitor the kernel or interact with the hardware
+	through controlled abstractions, or in exceptional cases,
+	to interact with the hardware in an unabstracted form.</para>
+
+      <para>Most hardware devices are exposed to user applications
+	via the device file system (devfs), which presents these devices
+	as file-like objects.
+	Protection properties (such as access control lists) on the
+	pseudo-files are combined with direct authorization in the device
+	drivers to control access.
+	Two devices, <filename>/dev/mem</filename> and
+	<filename>/dev/kmem</filename> permit unrestricted access to system 
+	memory and kernel memory, and are generally controlled so that only
+	highly privileged processes may use them.
+	The kmem interface was used extensively in earlier versions of
+	FreeBSD and other UNIX-derived systems as a means to monitor kernel
+	activities; in more recent versions of FreeBSD, the &man.sysctl.3;
+	API is used to monitor and manipulate a structured kernel MIB,
+	providing a more controlled interface with defined semantics.
+	The kmem interface may also be used for debugging purposes.</para>
+      
+      <para>On many hardware platforms, it is possible for the kernel to 
+	authorize user processes to perform I/O directly; on the i386
+	platform, opening the <filename>/dev/io</filename> device enables
+	direct I/O access.
+	Other platforms provide similar functionality.
+	Many platforms also offer hardware-specific services via the
+	&man.sysarch.2; system call; some of the functions provided by the
+	system call are process-local, but others may provide privileged
+	services.
+	For example, the i386 sysarch() call implements a
+	<literal>I386_SET_IOPERM</literal> operation that also enables
+	direct hardware I/O.
+	Careful maintenance of the protections of these special devices
+	and interfaces is vital to the proper protection of the kernel
+	and user processes via address space protections.</para>
+    </sect2>
+    
+    <sect2 id="secarch-kernel-modules">
+      <title>Kernel Extension via Kernel Modules</title>
+	
+      <para>FreeBSD permits the boot-time and run-time extension of the 
+	operating system kernel through loadable kernel modules.
+	This facility is used to load device drivers supporting new
+	hardware devices on-demand, add support for new file systems, as
+	well as binary emulation layers and other services.
+	Run-time extension of the kernel involves loading of the module
+	from a file into the kernel address space, followed by dynamic
+	linking of that module into the execution environment, and
+	eventual execution of any events and services in the module.
+	As such, the system calls to cause loading or unloading of kernel
+	modules are carefully controlled operations, as loading new code
+	into the kernel could be used by attackers to bypass other
+	protections.
+	Modules may also be loaded as part of the boot process; file
+	system protections are used to protect the integrity of any files 
+	involved in the boot process.</para>
+	
+      <para>Protection of the boot sequence is vital to the secure 
+	operation of the system; this requires protection of any hardware 
+	devices involved in the boot process, as well as any files (such
+	as the boot loader, kernel, modules, and configuration files)
+	from inappropriate access.</para>
+    </sect2>
+  </sect1>
+
+  <sect1 id="secarch-processes">
+    <title>Process Protections</title>
+
+    <para>Processes represent the high-level abstraction of a program
+      "in execution"; each process consists of a virtual memory address
+      space (including a mapping of executable code from a file), signal
+      delivery information, process credentials, one or more threads in
+      execution, a pool of resource limits, and an array of file
+      descriptors holding references to a variety of I/O and system
+      objects.
+      Processes generally run "in isolation", communicating with other
+      processes only through explicit and intentional communication
+      channels (such as files, IPC primitives, etc).  Shared memory is
+      permitted, but must be explicitly configured by the process.</para>
+
+    <para>Programs, as executed in processes, are generally derived
+      from an executable image (file) that has been mapped into the
+      process address space.
+      While the kernel does not provide explicit support for it, most
+      applications on FreeBSD make use of dynamically linked libraries,
+      implemented via the memory mapping of library files into the
+      process address space.
+      In practice, most programs executing on FreeBSD systems
+      are composed from a variety of run-time linked libraries, and
+      frequently pluggable modules loaded as shared objects.</para>
+
+    <para>The ability to directly modify process memory or other
+      operation parameters represents the ability to control the
+      execution of the process, and hence manipulate its operation.
+      By providing memory and other protections, the FreeBSD kernel
+      limits inappropriate interference between processes, preventing
+      accidental or intentional leakage of data, damage to data or
+      operational integrity, and leakage of system privilege.
+      System debugging interfaces break down these barriers, and must
+      be carefully controlled.</para>
+
+    <sect2 id="secarch-process-credentials">
+      <title>Process Credentials</title>
+
+      <para>FreeBSD assigns each process a credential, which holds
+	a variety of information relating to the privileges available
+	to the process.
+	The process credential contains several elements, including real,
+	saved, and effective user IDs, group IDs, resource limit
+	information for the user, a reference to a system jail, and an
+	extensible MAC label.
+	This data will be used to compute access control results
+	associated with most security-sensitive operations.</para>
+
+      <para>Consistency and performance are provided in the
+	multi-threaded kernel by also assigning a credential to each
+	thread.
+	The thread credential is compared to the primary process
+	credential whenever the thread enters the kernel, and if it
+	differs from the process credential, it is updated to reflect
+	the latest snapshot of the process credential.
+	This ensures a consistent credential for the duration of the
+	system call, and most access control checks for thread
+	operations are performed against the thread credential (a
+	thread-local variable) as it does not require explicit locking.
+	During update operations, the process lock is held across a
+	check and update of the process credential to ensure consistency
+	and prevent races.</para>
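+
+      <para>The snapshot behavior described above can be sketched in
+        a few lines.
+        This is a simplified user-space model, not the kernel's
+        implementation: the <literal>Proc</literal>,
+        <literal>Thread</literal>, and <literal>Cred</literal> classes
+        and their method names are hypothetical, and an ordinary mutex
+        stands in for the process lock.</para>

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Cred:
    """Immutable stand-in for a credential; only the euid is modelled."""
    euid: int


class Proc:
    def __init__(self, cred):
        self.lock = threading.Lock()  # stands in for the process lock
        self.cred = cred

    def set_cred(self, new_cred):
        # Updates to the process credential happen under the lock.
        with self.lock:
            self.cred = new_cred


class Thread:
    def __init__(self, proc):
        self.proc = proc
        self.cred = proc.cred  # thread-local snapshot

    def enter_kernel(self):
        # Compare the snapshot with the process credential on entry and
        # refresh it, under the lock, only when the two differ.
        if self.cred is not self.proc.cred:
            with self.proc.lock:
                self.cred = self.proc.cred

    def syscall(self, body):
        self.enter_kernel()
        # For the rest of the call, checks use the snapshot without locking.
        return body(self.cred)


if __name__ == "__main__":
    proc = Proc(Cred(euid=1001))
    td = Thread(proc)
    proc.set_cred(Cred(euid=0))
    print("stale snapshot euid:", td.cred.euid)
    print("euid seen by next syscall:", td.syscall(lambda c: c.euid))
```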
+
+      <para>The semantics of the various credential fields are
+	defined both by historical application requirements, and
+	in the POSIX specifications.
+	In general, the effective uid, effective gid, and additional
+	groups will be used to implement access control checks.
+	Processes may use the saved and real uid and gid to preserve
+	other credential elements for conditional use during execution:
+	for example, when a setuid application is executed, the saved
+	uid and gid are updated to the values of the effective uid
+	and gid prior to the execution.
+	This permits setuid applications to swap between the original
+	and file-originated ids, permitting the privileges to be
+	blended in a controlled manner.
+	Real and saved uids and gids will also be used in controlling
+	inter-process access control, and will be used under some
+	circumstances to control resource limits.</para>
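+
+      <para>The way a setuid program swaps between identities can be
+        modelled as follows.
+        This sketch assumes the POSIX behavior in which the saved uid
+        takes the value of the new effective uid when the setuid file
+        is executed; the <literal>Cred</literal> class and the two
+        helper functions are hypothetical and only mimic the rules, and
+        they do not call the real system interfaces.</para>

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cred:
    ruid: int   # real uid: the user who started the process
    euid: int   # effective uid: used for access control checks
    svuid: int  # saved uid: retained for later swaps


def exec_setuid(cred, file_owner):
    """Model executing a setuid file owned by `file_owner`."""
    # Assumption: the saved uid is set to the new effective uid.
    return replace(cred, euid=file_owner, svuid=file_owner)


def seteuid(cred, target):
    """Switch the effective uid, allowing only the real or saved uid
    as a target unless the caller is currently running as uid 0."""
    if cred.euid != 0 and target not in (cred.ruid, cred.svuid):
        raise PermissionError("EPERM")
    return replace(cred, euid=target)


if __name__ == "__main__":
    cred = exec_setuid(Cred(ruid=1001, euid=1001, svuid=1001), file_owner=70)
    cred = seteuid(cred, 1001)  # temporarily act as the invoking user
    print(cred)
    cred = seteuid(cred, 70)    # return to the file-originated uid
    print(cred)
```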
+
+      <para>Credentials are also cached in a variety of other
+	kernel data structures, generally at the point at which
+	initial access to the object occurs.
+	This caching permits "time of open" UNIX security semantics to
+	be implemented for several objects, including file descriptors
+	and mountpoints.
+	These credential references are then used to authorize
+	asynchronous write-behind, such as found in NFS.</para>
+
+      <para>As credentials are frequently referenced throughout the
+	system, but rarely modified, credentials are stored as
+	copy-on-write.
+	This permits new read-only references to credential structures
+	to be created with minimal memory overhead.
+	When a credential must be modified, a new copy of the credential
+	is created, modified, and then the process reference is updated
+	to point to the new credential.</para>
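+
+      <para>A minimal sketch of the copy-on-write pattern follows.
+        The helper names loosely echo the kernel's
+        <function>crhold</function> and <function>crfree</function>
+        routines, but the code is an illustrative model under
+        simplifying assumptions (no locking, a dictionary standing in
+        for the process structure), not the kernel
+        implementation.</para>

```python
class Ucred:
    """Toy reference-counted credential; the fields are illustrative."""

    def __init__(self, euid, groups):
        self.euid = euid
        self.groups = tuple(groups)
        self.refs = 1


def crhold(cred):
    """Take a new read-only reference: no copy, just a counter bump."""
    cred.refs += 1
    return cred


def crfree(cred):
    """Drop one reference."""
    cred.refs -= 1


def change_euid(proc, new_euid):
    """Modify the process credential by copying, never in place."""
    old = proc["cred"]
    new = Ucred(new_euid, old.groups)  # fresh copy carrying the change
    proc["cred"] = new                 # repoint the process reference
    crfree(old)                        # the process no longer uses `old`


if __name__ == "__main__":
    proc = {"cred": Ucred(1001, [1001, 0])}
    cached = crhold(proc["cred"])      # e.g. cached by an open file
    change_euid(proc, 0)
    print("cached euid:", cached.euid, "process euid:", proc["cred"].euid)
```

      Because the cached reference still points at the old, unmodified
      structure, holders of earlier references are unaffected by the
      change.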
+
+      <para>A variety of hazards are associated with dynamic changes
+	in process credentials, as processes may be the object of
+	operations, not just the subject.
+	When a process runs with upgraded or downgraded privileges, risks
+	may exist.
+	For example, even if a process reduces its level of privilege, it
+	may have cached access to objects or memory that is not revoked
+	with the explicit credential change (such as keying material in
+	library-managed or free'd memory).
+	When a process receives upgraded privileges, such as on execution
+	of a setuid binary, the system must revoke access to debug
+	the process by other processes that may already have had
+	debugging sessions open.</para>
+
+      <para>These protections are introduced in three ways: first,
+	disallowing operations that may upgrade process credentials
+	if access to the process cannot be revoked.
+	Second, storage of a "credential change flag", named P_SUGID for
+	historical reasons, which will be used to modify the
+	inter-process access control policy by indicating a change has
+	happened in the process life time.
+	Third, the explicit revocation of existing access to the process.
+	Additional information about inter-process access control may be
+	found later in this chapter.</para>
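+
+      <para>The first two mechanisms can be illustrated with a
+        deliberately simplified model.
+        The <literal>Proc</literal> class, the
+        <literal>sugid</literal> field (standing in for P_SUGID), and
+        the rules in <literal>can_debug</literal> are assumptions made
+        for this sketch; the real inter-process checks consider more
+        state than is shown here.</para>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proc:
    euid: int
    sugid: bool = False                # models the P_SUGID flag
    debugger: Optional["Proc"] = None  # attached debugger, if any


def can_debug(debugger, target):
    """Simplified policy: uid 0 may always debug; other users need a
    matching uid and a target whose credentials have never changed."""
    if debugger.euid == 0:
        return True
    return debugger.euid == target.euid and not target.sugid


def exec_setuid(proc, file_owner):
    """Model executing a setuid file; returns True if the upgrade happened."""
    # First mechanism: refuse the upgrade while an unprivileged
    # debugger is attached, since that access cannot be taken back here.
    if proc.debugger is not None and proc.debugger.euid != 0:
        return False
    proc.euid = file_owner
    proc.sugid = True  # second mechanism: remember the change
    return True


if __name__ == "__main__":
    user = Proc(euid=1001)
    target = Proc(euid=1001)
    print("before exec:", can_debug(user, target))
    exec_setuid(target, 0)
    target.euid = 1001  # the program later drops privilege again
    print("after exec and drop:", can_debug(user, target))
```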
+
+      <para>In most situations, the user credential structure is
+	sufficient to encapsulate all the necessary subject information
+	required for an access control decision.
+	However, under some circumstances, additional process information
+	may also be used in the decision to exempt closely related
+	processes from certain protections--for example, participation in
+	the same session is sufficient to authorize delivery of the
+	"continue" signal between processes, regardless of credentials.
+      </para>
+    </sect2>
+
+    <sect2 id="secarch-privilege-model">
+      <title>Root Privilege Model</title>
+
+      <para>The uid 0, assigned to the root user, is given special
+	privilege to bypass system protections, including most
+	discretionary and mandatory protections on the local
+	system.
+	This privilege is referred to as the "superuser", and is used
+	for system processes, during the boot and shutdown processes,
+	and for management purposes.
+	Because of this concentration of privilege, required to
+	perform a number of system activities, system services
+	running with root privilege are popular targets for attack,
+	as gaining access to uid 0 grants access to most other
+	privileges in the system.</para>
+
+      <para>FreeBSD ships with the securelevel protection mechanism,
+	first distributed with 4.4BSD.
+	Securelevels limit the scope of root privilege based on a
+	monotonically increasing current securelevel.
+	As the securelevel increases, various privileges are removed,
+	including the privilege to directly access disk devices, to
+	change file system protection flags, and to modify the
+	firewall configuration.
+	This model does not provide comprehensive protection against
+	the compromise of root privilege, but if properly configured,
+	can be used to improve the safety of the recovery
+	process.</para>
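+
+      <para>The monotonic behavior can be sketched as a small model.
+        The class, the operation names, and in particular the level at
+        which each privilege is removed are assumptions chosen for
+        illustration; the authoritative list of restrictions for each
+        level is the system documentation, not this
+        sketch.</para>

```python
class Securelevel:
    """Toy model of a securelevel that may only be raised."""

    def __init__(self, level=-1):
        self.level = level

    def raise_to(self, new_level):
        if new_level < self.level:
            raise PermissionError("securelevel may only increase")
        self.level = new_level


# Hypothetical thresholds: the level at which root loses each privilege.
RESTRICTED_AT = {
    "change_file_flags": 1,
    "write_raw_disk": 2,
    "modify_firewall": 3,
}


def root_may(operation, securelevel):
    """Return True if uid 0 still holds `operation` at the current level."""
    return securelevel.level < RESTRICTED_AT[operation]


if __name__ == "__main__":
    sl = Securelevel()
    print("flags before:", root_may("change_file_flags", sl))
    sl.raise_to(2)
    print("flags after:", root_may("change_file_flags", sl))
    print("firewall after:", root_may("modify_firewall", sl))
```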
+
+      <para>The privileges of uid 0 are also bounded when used in
+	combination with the jail() security extension, described
+	later in this chapter.</para>
+
+      <para>The TrustedBSD MAC Framework is also capable of
+	limiting certain root privileges, such as the capability to
+	read files based on system labels.
+	The MAC Framework and policies are described later in this
+	chapter.</para>
+    </sect2>
+
+    <sect2 id="secarch-resource-limits">
+      <title>Process Resource Limits</title>
+
+      <para>FreeBSD is fundamentally designed around a resource-sharing
+	model, in which the operating system controls access to a
+	set of real hardware resources and mediates access to ensure
+	consistent and appropriate use.
+	As UNIX-derived systems are frequently deployed in environments
+	in which users or processes contend for resources, a variety of
+	approaches are taken to preventing inappropriate exclusion of
+	other users or processes.
+	These include scheduler behavior to provide for "fair"
+	distribution of CPU resources between independent processes
+	based on priorities, a file system quota mechanism to bound
+	maximum consumption of resources by user or group, and a set
+	of process resource limits bounding access to a variety of
+	resources at the granularity of both uids (globally) and
+	process hierarchies.
+	In multiuser environments, such as ISP shell servers, resource
+	limits are vitally important to successful long-term operation:
+	users have a number of unfortunate habits, including the
+	creation of programs that (intentionally or otherwise) attempt
+	to consume all available system resources.</para>
+
+      <sect3 id="secarch-scheduler">
+	<title>System Scheduler Priorities</title>
+
+	<para>FreeBSD 5.1 ships with two system schedulers, one of
+	  which must be selected at kernel compile-time.
+	  The 4.4BSD scheduler implements a time-sharing, floating
+	  priority scheduler based on user-assigned process priorities,
+	  with additional support for real-time and idle scheduling.
+	  The ULE scheduler implements a similar scheduling policy, but
+	  adds optimizations for threading and non-symmetric CPU
+	  topologies, as well as structural optimizations for MP
+	  environments.</para>
+
+	<para>The UNIX priority scheme assigns fixed priorities to
+	  kernel and user processes; lower priority values indicate
+	  a higher precedence.
+	  Kernel processes generally take priorities based on
+	  compile-time configuration.
+	  User processes inherit their priority from their parent
+	  process, and priorities may be updated using the
+	  setpriority() system call, which operates on a process,
+	  process group, or all processes owned by a specified user.
+	  In general, privilege is required to lower the priority (raise
+	  the precedence) of a process.
+	  Processes may perform operations to modify the scheduling
+	  properties of another process; policies associated with
+	  these operations are described in the Inter-Process
+	  Authorization section.
+	  Kernel locking primitives make use of priority propagation to
+	  prevent priority inversions on contended kernel
+	  resources.</para>
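+
+	<para>As a rough illustration of the privilege rule for
+	  setpriority(), the following Python sketch uses the standard
+	  os.getpriority() and os.setpriority() wrappers.
+	  The helper names lower_own_precedence and try_raise_precedence
+	  are hypothetical, and the cap of 19 on the nice value is an
+	  assumption chosen to be valid on common systems.</para>
+
```python
import os


def lower_own_precedence(step=1):
    """Raise this process's nice value by `step` (no privilege needed).

    Returns the (before, after) nice values. The cap of 19 is an assumption.
    """
    before = os.getpriority(os.PRIO_PROCESS, 0)   # 0 selects the caller
    target = min(before + step, 19)
    os.setpriority(os.PRIO_PROCESS, 0, target)
    return before, os.getpriority(os.PRIO_PROCESS, 0)


def try_raise_precedence(step=1):
    """Try to lower the nice value; return False if privilege was required."""
    current = os.getpriority(os.PRIO_PROCESS, 0)
    try:
        os.setpriority(os.PRIO_PROCESS, 0, current - step)
        return True
    except PermissionError:
        return False


if __name__ == "__main__":
    print("nice before/after:", lower_own_precedence())
    print("raising precedence permitted:", try_raise_precedence())
```
+
+	<para>Run without privilege, the second call is expected to be
+	  refused on most systems, although local policy may permit
+	  it.</para>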
+
+	<para>In addition to the UNIX process priority ranges, processes
+	  may also operate with realtime or idle priority.
+	  Realtime processes preempt processes of lower priorities, and
+	  when contending against equal priority processes, are executed
+	  round-robin.
+	  Idle priority processes operate only when no other processes
+	  are able to execute.
+	  Because both realtime and idle priority processes can result in
+	  priority inversions, privilege is required to modify the
+	  realtime or idle priorities of a process.</para>
+      </sect3>
+
+      <sect3 id="secarch-globalmeasurement">
+	<title>Per-Uid Global Resource Measurement</title>
+
+	<para>FreeBSD permits limits on the number of processes and
+	  amount of socket buffer space, two particularly sensitive
+	  system resources.
+	  Measurements are taken globally on a per-uid basis.
+	  The FreeBSD kernel maintains a global list of both consumed
+	  and maximum resources per-uid in reference-counted uidinfo
+	  structures.
+	  References to the uidinfo structure for the effective and real
+	  uids are cached in the user credential structure, and are
+	  updated when the uid of a process changes.</para>
+
+	<para>Process counts are maintained based on the number of
+	  processes owned by a particular real uid.
+	  Updates to the per-uid process count are performed when the
+	  first process is created, whenever a process forks or exits,
+	  and whenever a process changes its real uid.
+	  Resource limits on process counts are checked only on process
+	  fork; uid change operations will not fail by virtue of a
+	  resource limit.</para>
+
+	<para>Socket buffer sizes are maintained based on the sum of
+	  the high watermark sizes of sockets allocated by a
+	  particular effective uid.
+	  Updates to the per-uid socket buffer count are performed when
+	  a socket is allocated (via the socket() system call or as part
+	  of a new incoming connection), or when data is sent or
+	  received on the socket that may expand the high watermark.
+	  Resource limits are checked only when new sockets are
+	  created.</para>
+      </sect3>
+
+      <sect3 id="secarch-plimits">
+	<title>Per-Process Limits</title>
+
+	<para>Some resource limits, such as number of processes and
+	  maximum socket buffer per uid, are measured globally across
+	  the system.
+	  Other resources, such as VM space consumption or stack size,
+	  are measured locally to the process.
+	  In both cases, the limits imposed are process-local in that
+	  resource limits are a per-process property.
+	  Each process has a reference to a per-process limit structure,
+	  which consists of a set of limits associated with different
+	  resources.
+	  Each limit contains two elements: a soft (current) limit, and
+	  a hard (maximum) limit which represents the greatest value the
+	  current limit may be increased to without privilege.
+	  Process limits are inherited on fork(); internally, the
+	  limits are stored copy-on-write.
+	  Resource limits are tested on the allocation of the resource,
+	  such as on the allocation of new memory, creation of a socket,
+	  or forking of a process.</para>
+
+	<para>Resources measured and controlled by the plimit structure
+	  include CPU time, maximum file size, maximum address space
+	  "data" size, maximum address space "stack" size, maximum
+	  size of core file, maximum resident set size, maximum
+	  memory pages that may be locked into physical memory, the
+	  maximum number of processes for the real uid, maximum number
+	  of open files, maximum size consumed by socket buffers for
+	  the effective uid, and maximum virtual memory size (including
+	  file mappings).</para>
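+
+	<para>The soft and hard limit pair, and its inheritance across
+	  fork(), can be observed from userspace.
+	  The sketch below is a minimal illustration using Python's
+	  standard resource module; the helper names set_soft_nofile and
+	  soft_limit_seen_by_child and the value 256 are hypothetical
+	  choices, not part of any FreeBSD interface.</para>
+
```python
import os
import resource


def set_soft_nofile(new_soft):
    """Set the soft open-file limit, never exceeding the hard limit."""
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        new_soft = min(new_soft, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_seen_by_child():
    """Fork, and have the child report its inherited soft limit via a pipe."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(rfd)
        soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
        os.write(wfd, str(soft).encode())
        os._exit(0)
    os.close(wfd)
    data = os.read(rfd, 64)
    os.close(rfd)
    os.waitpid(pid, 0)
    return int(data)


if __name__ == "__main__":
    print("parent limits:", set_soft_nofile(256))
    print("child soft limit:", soft_limit_seen_by_child())
```
+
+	<para>Raising the soft limit back up to the hard limit needs no
+	  privilege; raising the hard limit itself does.</para>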
+      </sect3>
+    </sect2>
+
+    <sect2 id="secarch-interprocess-authorization">
+      <title>Inter-Process Authorization</title>
+
+      <para>Processes interact explicitly through a variety of
+	communication channels, including the file system and IPC
+	services.
+	They may also directly interact through a series of inter-process
+	services.
+	These include signalling (which may act as IPC, modify
+	scheduling, or signal termination), a variety of monitoring
+	mechanisms such as those used to implement &man.ps.1;, scheduler
+	services to modify a process priority or scheduling model, and a
+	set of debugging interfaces permitting a process to closely
+	monitor and modify the behavior of another process.
+	Inter-process operations permit the flow of information and
+	control between processes: signals may directly control the
+	operation and behavior of a process; visibility of process
+	command lines shares information about what the process is doing;
+	scheduling services may prevent a process from running or cause it
+	to operate improperly; and debugging often permits direct control
+	of the process and access to any resources accessible to the
+	target process.</para>
+
+      <para>Protections associated with these services are important to
+	prevent serious security vulnerabilities: in most cases, the
+	protection model requires that, to modify the behavior of another
+	process via signalling, scheduling, or debugging, the subject
+	(initiating) process must hold privileges identical to, or a
+	superset of, those of the object (target) process.
+	Additional protections may be assigned to a process if the
+	process has modified or downgraded its privileges, as it may
+	still have local references to information or resources beyond
+	those normally available to its new privileges; these protections
+	are typically used to prevent attaching a debugger to a process
+	that has run as root and is now executing as another uid during
+	the login process, or to protect setuid or setgid programs in
+	execution.
+	Vulnerabilities exploitable without these protections might
+	include access to password files or keying material still held
+	by the process.</para>
+
+      <para>An important limitation to the safety of inter-process
+	operations is derived from the UNIX "process id" (pid) model.
+	Each process is assigned a numeric identifier, unique for the
+	lifetime of the process.
+	Operations targeted at processes generally specify the
+	target process by means of its pid; however, pids may (and
+	eventually will) be reused following the death of a process.
+	While FreeBSD can be operated in a mode in which pids are
+	randomly allocated, eventual reuse of a pid is guaranteed
+	with any reasonable uptime.
+	Races in pid use may therefore cause signals to be delivered
+	to the wrong process.
+	While inter-process access control prevents the malicious
+	delivery of signals to processes based on differing
+	credentials, it cannot prevent the accidental delivery of a
+	signal to an unintended process by an authorized process.
+	</para>
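+
+      <para>The pid lifetime problem can be made concrete with a small
+	Python sketch.
+	It forks a child, probes it with signal 0 (which performs the
+	existence and permission checks without delivering anything),
+	kills and reaps it, and probes again.
+	The helper names pid_is_signalable and spawn_and_reap are
+	hypothetical.</para>
+
```python
import os
import signal


def pid_is_signalable(pid):
    """Return True if the caller could deliver a signal to `pid` right now."""
    try:
        os.kill(pid, 0)              # signal 0: checks only, nothing delivered
        return True
    except ProcessLookupError:       # ESRCH: no process currently has this pid
        return False
    except PermissionError:          # EPERM: process exists, credentials differ
        return False


def spawn_and_reap():
    """Fork a child, then kill and reap it, probing its pid before and after."""
    pid = os.fork()
    if pid == 0:
        signal.pause()               # child simply waits to be signalled
        os._exit(0)
    before = pid_is_signalable(pid)
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)               # after reaping, the pid may be reused
    after = pid_is_signalable(pid)
    return pid, before, after


if __name__ == "__main__":
    print(spawn_and_reap())
```
+
+      <para>Once the child has been reaped, nothing ties the saved pid
+	to the original process: a later os.kill() on that number would
+	reach whichever process holds the pid at that moment, subject
+	only to the credential checks.</para>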
+
+      <para>MAC Framework policy modules are permitted to augment 
+	inter-process protections, and many do so.</para>
+    </sect2>
+  </sect1>
+
+  <sect1 id="secarch-fileobjects">
+    <title>File Descriptors, File Systems, and Storage Security</title>
+
+    <para></para>
+
+    <sect2 id="secarch-fds">
+      <title>File Descriptors</title>
+
+      <para>Each FreeBSD process has an associated array of references
+	to active I/O objects, known as file descriptors, which
+	refer to active object sessions.
+	For most processes, the file descriptor array will not be
+	shared; however, it is possible to create processes that share
+	file descriptors using the rfork() call--this behavior is
+	required to emulate Linux threading.
+	Each object session, described by a <structname>struct
+	file</structname> in the kernel, has an associated underlying
+	object, operation vector, cached credential from the time of
+	creation, access mode, and current offset.</para>
+
+      <para>Object sessions are initially referred to by one file
+	descriptor, but references may be duplicated to additional file
+	descriptors, as well as inherited across fork() operations, and
+	passed to other processes using UNIX Domain Socket ancillary
+	rights transfer.
+	In FreeBSD 5.1, objects referenced by file descriptors are: IPC
+	pipes, IPC sockets, vnodes (files, directories, device nodes,
+	POSIX fifos, etc.), and kqueues (kernel event notification queues).
+	References to object sessions remain until the descriptor is
+	explicitly closed via the close() or rfork() system calls, or
+	implicitly closed on process exec() or exit().
+	File descriptor arrays may not be modified except by processes
+	that reference them--however, the active object sessions may be
+	modified, as may the underlying objects.
+	File descriptor properties, such as offset and active access
+	flags, may be explicitly modified using system calls such as
+	lseek() or fcntl(), or implicitly as a result of operations making
+	use of the file descriptor, such as read() or write().</para>
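+
+      <para>The difference between duplicating a reference to an object
+	session and creating a new session can be seen through the
+	file offset.
+	In the sketch below (Python, with the hypothetical helper name
+	shared_offset_demo), a descriptor created with dup() shares
+	the offset of the original, while a second open() of the same
+	file starts with its own.</para>
+
```python
import os
import tempfile


def shared_offset_demo():
    """Compare offsets of a dup()ed descriptor and an independently opened one."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"abcdefgh")
        tmp.flush()
        fd1 = os.open(tmp.name, os.O_RDONLY)   # new object session
        fd2 = os.dup(fd1)                      # second reference, same session
        fd3 = os.open(tmp.name, os.O_RDONLY)   # separate session, same file
        try:
            first = os.read(fd1, 3)            # advances the shared offset
            off_dup = os.lseek(fd2, 0, os.SEEK_CUR)
            off_new = os.lseek(fd3, 0, os.SEEK_CUR)
        finally:
            for fd in (fd1, fd2, fd3):
                os.close(fd)
    return first, off_dup, off_new


if __name__ == "__main__":
    print(shared_offset_demo())
```
+
+      <para>The function returns (b"abc", 3, 0): the read through the
+	first descriptor moved the offset seen through its duplicate,
+	but not the offset of the independent session.</para>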
+
+      <para>In most cases, accesses made using a file descriptor are
+	authorized using the cached file credential from the creation
+	of the file descriptor: for example, NFS read operations
+	initiated as a result of a read on the file descriptor will
+	be authorized using the credential that opened the file,
+	which may be different from the credential used to initiate
+	the read operation.
+	However, some accesses are authorized using the active
+	credential: typically, this includes meta-data and
+	administrative operations such as fchmod() on files, or ioctl()
+	on sockets.
+	Mandatory protections enforced by the MAC Framework may depend
+	on either the active or file credential.
+	For local file systems, protections are typically only enforced
+	with the open() operation, with the exception of
+	securelevel-related flags such as system immutable.</para>
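+
+      <para>The open-time enforcement described above for local file
+	systems can be sketched as follows.
+	The example is illustrative only and assumes a local file
+	system holding the temporary directory; read_after_revoke is a
+	hypothetical helper name.
+	It opens a file, removes all permission bits, and then reads
+	through the descriptor that is already open.</para>
+
```python
import os
import tempfile


def read_after_revoke():
    """Read through an open descriptor after the file's mode is set to 0."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"secret")
        os.close(fd)
        rfd = os.open(path, os.O_RDONLY)   # permission check happens here
        os.chmod(path, 0)                  # remove every permission bit
        try:
            data = os.read(rfd, 16)        # no new check on the open session
        finally:
            os.close(rfd)
        try:
            os.close(os.open(path, os.O_RDONLY))
            reopen_ok = True               # typically only with privilege
        except PermissionError:
            reopen_ok = False
    finally:
        os.unlink(path)
    return data, reopen_ok


if __name__ == "__main__":
    print(read_after_revoke())
```
+
+      <para>The read succeeds even though a fresh open() by an
+	unprivileged process would now be refused, which is why
+	revoking permissions does not revoke descriptors that are
+	already open.</para>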
+    </sect2>
+
+    <sect2 id="secarch-filesystem">
+      <title>File System Protection Model</title>
+
+      <sect3 id="secarch-fsnamespace">
+	<title>File System Namespace Protections</title>
+
+	<para>The FreeBSD virtual file system (VFS) consists of
+	  a namespace constructed out of individual file system
+	  mounts, and a set of objects (including directories and
+	  files) that make up the namespace. 
+	  Processes look up objects relative to either the process
+	  root directory, or process current working directory; in
+	  general, objects cannot be accessed without passing
+	  through the file system namespace; objects that cannot
+	  be named by a process are, in most cases, inaccessible to
+	  the process, although references to unnameable objects
+	  outside a process's namespace may be passed via IPC.</para>
+
+	<para>As a result of these properties, three common protections
+	  are available to protect objects in the file system
+	  namespace: chroot() or mountpoint covering may be used
+	  to prevent the process from constructing a name to the
+	  object; namespace protections, such as access control
+	  lists on directories, can prevent a process from traversing
+	  the namespace to an object; finally, protections on the
+	  object itself can prevent inappropriate access to an
+	  object--for example, file permissions may permit read of
+	  an object, but not write.
+	  Objects may appear more than once in the same namespace by
+	  virtue of hard links and synthetic mountpoints: as a result,
+	  caution must be applied when relying on namespace-based
+	  protections to limit access to an object.</para>
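+
+	<para>The caution about hard links can be demonstrated with a
+	  short Python sketch.
+	  The directory and file names, and the helper name
+	  hard_link_demo, are hypothetical.
+	  A file is created under one directory, linked into another,
+	  and the first directory is then made untraversable.</para>
+
```python
import os
import tempfile


def hard_link_demo():
    """Reach a file through a second name after blocking the first path."""
    with tempfile.TemporaryDirectory() as top:
        private = os.path.join(top, "private")
        public = os.path.join(top, "public")
        os.mkdir(private)
        os.mkdir(public)
        original = os.path.join(private, "data")
        alias = os.path.join(public, "alias")
        with open(original, "wb") as f:
            f.write(b"payload")
        os.link(original, alias)               # second name, same object
        same = os.path.samestat(os.stat(original), os.stat(alias))
        nlink = os.stat(alias).st_nlink
        os.chmod(private, 0)                   # block traversal via first name
        try:
            with open(alias, "rb") as f:       # second name still reaches it
                data = f.read()
        finally:
            os.chmod(private, 0o700)           # restore so cleanup can proceed
    return same, nlink, data


if __name__ == "__main__":
    print(hard_link_demo())
```
+
+	<para>The function returns (True, 2, b"payload"): the directory
+	  protection on the first name does nothing to restrict access
+	  through the second.</para>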
+
+	<para>Modifications to the namespace may be performed by
+	  adding or removing file system mounts, attaching, overlaying,
+	  or detaching parts of the namespace, or by modifying elements
+	  in the namespace by performing operations on objects in the
+	  namespace.
+	  Mount and unmount operations require privilege in FreeBSD
+	  by default; however, the system policy may be configured to
+	  permit user mounts under limited circumstances.
+	  Access to the mount primitives is generally limited because
+	  the ability to directly access the underlying storage
+	  mechanism implies the ability to manipulate the file system
+	  namespace and protections, bypassing OS limits on those
+	  operations.
+	  When user mounts are permitted, the underlying device must
+	  be readable (and optionally writable) by the user performing
+	  the mount operation; in addition, users may only mount
+	  new file systems on top of objects (typically, directories)
+	  that they own.
+	  Modifications to elements in a file system namespace are
+	  typically authorized in a file system-specific manner, based
+	  on the mandatory and discretionary protections offered by
+	  the file system.</para>
+
+	<para>The MAC Framework permits file system operations to
+	  be controlled across all file systems, with protections
+	  implemented above the per-filesystem layer.
+	  File systems may support multi-label operation, assigning
+	  labels to each object in the file system in a file
+	  system-specific manner, or single-label operation in which all
+	  objects in the file system share the same label, requiring
+	  no special support by the file system for MAC.</para>
+      </sect3>
+
+      <sect3 id="secarch-fsobjects">
+	<title>File System Objects and Operations</title>
+
+	<para>The FreeBSD VFS defines several classes of objects, and
+	  operations that apply to one or more of those objects.
+	  The following operations may be supported on a virtual file
+	  node:</para>
+
+	<variablelist>
+	  <varlistentry>
+	    <term>create()</term>
+	    <listitem>
+	      <para>Create a new file system object; parent directory,

>>> TRUNCATED FOR MAIL (1000 lines) <<<