svn commit: r43645 - head/en_US.ISO8859-1/books/arch-handbook/boot
Warren Block <wblock at FreeBSD.org>
Sun Jan 26 02:30:35 UTC 2014

Author: wblock
Date: Sun Jan 26 02:30:34 2014
New Revision: 43645
URL: http://svnweb.freebsd.org/changeset/doc/43645
Log:
  Rewrite of portions of the Boot chapter by Sergio Andrés Gómez del Real.
  Committed version is a modified version of the one submitted with the
  patch. Thanks to Sergio Andrés Gómez del Real for the submission, to
  John-Mark Gurney for technical review, and to both for their patience.
  
  PR:		docs/185780
  Submitted by:	Sergio Andrés Gómez del Real <Sergio.G.DelReal at gmail.com>
  Reviewed by:	jmg
Modified:
  head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml
Modified: head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml
==============================================================================
--- head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml	Sun Jan 26 00:10:46 2014	(r43644)
+++ head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml	Sun Jan 26 02:30:34 2014	(r43645)
@@ -4,6 +4,8 @@ The FreeBSD Documentation Project
 
 Copyright (c) 2002 Sergey Lyubka <devnull at uptsoft.com>
 All rights reserved
+Copyright (c) 2014 Sergio Andrés Gómez del Real <Sergio.G.delReal at gmail.com>
+All rights reserved
 $FreeBSD$
 -->
 
@@ -25,6 +27,18 @@ $FreeBSD$
       </author>
       <!-- devnull at uptsoft.com  12 Jun 2002 -->
     </authorgroup>
+
+    <authorgroup>
+      <author>
+	<personname>
+	  <firstname>Sergio Andrés</firstname>
+	  <surname> Gómez del Real</surname>
+	</personname>
+
+	<contrib>Updated and enhanced by </contrib>
+      </author>
+      <!-- Sergio.G.DelReal at gmail.com  Jan 2014 -->
+    </authorgroup>
   </info>
 
   <sect1 xml:id="boot-synopsis">
@@ -37,88 +51,103 @@ $FreeBSD$
     <indexterm><primary>booting</primary></indexterm>
     <indexterm><primary>system initialization</primary></indexterm>
     <para>This chapter is an overview of the boot and system
-      initialization process, starting from the BIOS (firmware) POST,
-      to the first user process creation.  Since the initial steps of
-      system startup are very architecture dependent, the IA-32
-      architecture is used as an example.</para>
+      initialization processes, starting from the <acronym>BIOS</acronym> (firmware)
+      <acronym>POST</acronym>, to the first user process creation.  Since the initial
+      steps of system startup are very architecture dependent, the
+      IA-32 architecture is used as an example.</para>
+
+    <para>The &os; boot process can be surprisingly complex.  After
+      control is passed from the <acronym>BIOS</acronym>, a considerable amount of
+      low-level configuration must be done before the kernel can be
+      loaded and executed.  This setup must be done in a simple and
+      flexible manner that leaves the user a great deal of room for
+      customization.</para>
   </sect1>
 
   <sect1 xml:id="boot-overview">
     <title>Overview</title>
 
-    <para>A computer running FreeBSD can boot by several methods,
-      although the most common method, booting from a harddisk where
-      the OS is installed, will be discussed here.  The boot process
-      is divided into several steps:</para>
-
-    <itemizedlist>
-      <listitem><para>BIOS POST</para></listitem>
-      <listitem><para><literal>boot0</literal> stage</para></listitem>
-      <listitem><para><literal>boot2</literal> stage</para></listitem>
-      <listitem><para>loader stage</para></listitem>
-      <listitem><para>kernel initialization</para></listitem>
-    </itemizedlist>
+    <para>The boot process is an extremely machine-dependent
+      activity.  Not only must code be written for every computer
+      architecture, but there may also be multiple types of booting on
+      the same architecture.  For example, looking at
+      <filename class="directory">/usr/src/sys/boot</filename>
+      reveals a great amount of architecture-dependent code.  There is
+      a directory for each of the various supported architectures.  In
+      the x86-specific <filename class="directory">i386</filename>
+      directory, there are subdirectories for different boot standards
+      like <filename>mbr</filename> (Master Boot Record),
+      <filename>gpt</filename> (<acronym>GUID</acronym> Partition
+      Table), and <filename>efi</filename> (Extensible Firmware
+      Interface).  Each boot standard has its own conventions and data
+      structures.  The example that follows shows booting an x86
+      computer from an <acronym>MBR</acronym> hard drive with the &os;
+      <filename>boot0</filename> multi-boot loader stored in the very
+      first sector.  That boot code starts the &os; three-stage boot
+      process.</para>
+
+    <para>The key to understanding this process is that it is a series
+      of stages of increasing complexity.  These stages are
+      <filename>boot1</filename>, <filename>boot2</filename>, and
+      <filename>loader</filename> (see &man.boot.8; for more detail).
+      The boot system executes each stage in sequence.  The last
+      stage, <filename>loader</filename>, is responsible for loading
+      the &os; kernel.  Each stage is examined in the following
+      sections.</para>
 
-    <indexterm><primary>BIOS POST</primary></indexterm>
-    <indexterm><primary>boot0</primary></indexterm>
-    <indexterm><primary>boot2</primary></indexterm>
-    <indexterm><primary>loader</primary></indexterm>
-    <para>The <literal>boot0</literal> and <literal>boot2</literal>
-      stages are also referred to as <emphasis>bootstrap stages 1 and
-      2</emphasis> in &man.boot.8; as the first steps in FreeBSD's
-      3-stage bootstrapping procedure.  Various information is printed
-      on the screen at each stage, so you may visually recognize them
-      using the table that follows.  Please note that the actual data
+    <para>Here is an example of the output generated by the
+      different boot stages.  Actual output
       may differ from machine to machine:</para>
 
     <informaltable frame="none" pgwide="0">
       <tgroup cols="2">
 	<tbody>
 	  <row>
-	    <entry><para>Output (may vary)</para></entry>
-	    <entry><para>BIOS (firmware) messages</para></entry>
+	    <entry>&os; Component</entry>
+	    <entry>Output (may vary)</entry>
 	  </row>
 
 	  <row>
-	    <entry><para><screen>F1    FreeBSD
+	    <entry><literal>boot0</literal></entry>
+	    <entry><screen>F1    FreeBSD
 F2    BSD
-F5    Disk 2</screen></para></entry>
-	    <entry><para><literal>boot0</literal></para></entry>
+F5    Disk 2</screen></entry>
 	  </row>
 
 	  <row>
-	    <entry><para><screen>>>FreeBSD/i386 BOOT
-Default: 1:ad(1,a)/boot/loader
-boot:</screen></para></entry>
-	    <entry><para><literal>boot2</literal>
+	    <entry><literal>boot2</literal>
 		<footnote><para>This prompt will appear if the user
 		    presses a key just after selecting an OS to boot
 		    at the <literal>boot0</literal>
-		    stage.</para></footnote></para></entry>
+		    stage.</para></footnote></entry>
+	    <entry><screen>>>FreeBSD/i386 BOOT
+Default: 1:ad(1,a)/boot/loader
+boot:</screen></entry>
 	  </row>
 
 	  <row>
-	    <entry><para><screen>BTX loader 1.0 BTX version is 1.01
-BIOS drive A: is disk0
-BIOS drive C: is disk1
-BIOS 639kB/64512kB available memory
-FreeBSD/i386 bootstrap loader, Revision 0.8
+	    <entry><filename>loader</filename></entry>
+	    <entry><screen>BTX loader 1.00 BTX version is 1.02
+Consoles: internal video/keyboard
+BIOS drive C: is disk0
+BIOS 639kB/2096064kB available memory
+
+FreeBSD/x86 bootstrap loader, Revision 1.1
 Console internal video/keyboard
-(jkh at bento.freebsd.org, Mon Nov 20 11:41:23 GMT 2000)
-/kernel text=0x1234 data=0x2345 syms=[0x4+0x3456]
-Hit [Enter] to boot immediately, or any other key for command prompt
-Booting [kernel] in 9 seconds..._</screen></para></entry>
-	    <entry><para>loader</para></entry>
+(root at snap.freebsd.org, Thu Jan 16 22:18:05 UTC 2014)
+Loading /boot/defaults/loader.conf
+/boot/kernel/kernel text=0xed9008 data=0x117d28+0x176650 syms=[0x8+0x137988+0x8+0x1515f8]</screen></entry>
 	  </row>
 
 	  <row>
-	    <entry><para><screen>Copyright (c) 1992-2002 The FreeBSD Project.
+	    <entry>kernel</entry>
+	    <entry><screen>Copyright (c) 1992-2013 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
-FreeBSD 4.6-RC #0: Sat May  4 22:49:02 GMT 2002
-    devnull at kukas:/usr/obj/usr/src/sys/DEVNULL
-Timecounter "i8254"  frequency 1193182 Hz</screen></para></entry>
-	    <entry><para>kernel</para></entry>
+FreeBSD is a registered trademark of The FreeBSD Foundation.
+FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014
+    root at snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
+FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610</screen></entry>
 	  </row>
 	</tbody>
       </tgroup>
@@ -126,84 +155,114 @@ Timecounter "i8254"  frequency 1193182 H
   </sect1>
 
   <sect1 xml:id="boot-bios">
-    <title>BIOS POST</title>
+    <title>The <acronym>BIOS</acronym></title>
 
-    <para>When the PC powers on, the processor's registers are set
-      to some predefined values.  One of the registers is the
+    <para>When the computer powers on, the processor's registers are
+      set to some predefined values.  One of the registers is the
       <emphasis>instruction pointer</emphasis> register, and its value
       after a power on is well defined: it is a 32-bit value of
-      0xfffffff0.  The instruction pointer register points to code to
-      be executed by the processor.  One of the registers is the
+      <literal>0xfffffff0</literal>.  The instruction pointer register
+      (also known as the Program Counter) points to code to be
+      executed by the processor.  Another important register is the
       <literal>cr0</literal> 32-bit control register, and its value
-      just after the reboot is 0.  One of the cr0's bits, the bit PE
-      (Protection Enabled) indicates whether the processor is running
-      in protected or real mode.  Since at boot time this bit is
-      cleared, the processor boots in real mode.  Real mode means,
+      just after a reboot is <literal>0</literal>.  One of
+      <literal>cr0</literal>'s bits, the PE (Protection Enabled) bit,
+      indicates whether the processor is running in 32-bit protected
+      mode or 16-bit real mode.  Since this bit is cleared at boot
+      time, the processor boots in 16-bit real mode.  Real mode means,
       among other things, that linear and physical addresses are
-      identical.</para>
-
-    <para>The value of 0xfffffff0 is slightly less then 4Gb, so unless
-      the machine has 4Gb physical memory, it cannot point to a valid
-      memory address.  The computer's hardware translates this address
-      so that it points to a BIOS memory block.</para>
-
-    <para>BIOS stands for <emphasis>Basic Input Output
-	System</emphasis>, and it is a chip on the motherboard that
-      has a relatively small amount of read-only memory (ROM).  This
+      identical.  The reason for the processor not to start
+      immediately in 32-bit protected mode is backwards compatibility.
+      In particular, the boot process relies on the services provided
+      by the <acronym>BIOS</acronym>, and the <acronym>BIOS</acronym>
+      itself works in legacy, 16-bit code.</para>
+
+    <para>The value of <literal>0xfffffff0</literal> is slightly less
+      than 4 GB, so unless the machine has 4 GB of physical
+      memory, it cannot point to a valid memory address.  The
+      computer's hardware translates this address so that it points to
+      a <acronym>BIOS</acronym> memory block.</para>
+
+    <para>The <acronym>BIOS</acronym> (Basic Input Output
+      System) is a chip on the motherboard that has a relatively small
+      amount of read-only memory (<acronym>ROM</acronym>).  This
       memory contains various low-level routines that are specific to
-      the hardware supplied with the motherboard.  So, the processor
-      will first jump to the address 0xfffffff0, which really resides
-      in the BIOS's memory.  Usually this address contains a jump
-      instruction to the BIOS's POST routines.</para>
-
-    <para>POST stands for <emphasis>Power On Self Test</emphasis>.
-      This is a set of routines including the memory check, system bus
-      check and other low-level stuff so that the CPU can initialize
-      the computer properly.  The important step on this stage is
-      determining the boot device.  All modern BIOS's allow the boot
-      device to be set manually, so you can boot from a floppy,
-      CD-ROM, harddisk etc.</para>
-
-    <para>The very last thing in the POST is the <literal>INT
-	0x19</literal> instruction.  That instruction reads 512 bytes
-      from the first sector of boot device into the memory at address
-      0x7c00.  The term <emphasis>first sector</emphasis> originates
-      from harddrive architecture, where the magnetic plate is divided
-      to a number of cylindrical tracks.  Tracks are numbered, and
-      every track is divided by a number (usually 64) sectors.  Track
-      number 0 is the outermost on the magnetic plate, and sector 1,
-      the first sector (tracks, or, cylinders, are numbered starting
-      from 0, but sectors - starting from 1), has a special meaning.
-      It is also called Master Boot Record, or MBR.  The remaining
-      sectors on the first track are never used <footnote><para>Some
-	  utilities such as &man.disklabel.8; may store the
-	  information in this area, mostly in the second
-	  sector.</para></footnote>.</para>
+      the hardware supplied with the motherboard.  The processor will
+      first jump to the address <literal>0xfffffff0</literal>, which
+      really resides in the <acronym>BIOS</acronym>'s memory.  Usually
+      this address contains a jump instruction to the
+      <acronym>BIOS</acronym>'s <acronym>POST</acronym> routines.</para>
+
+    <para>The <acronym>POST</acronym> (Power On Self Test)
+      is a set of routines including the memory check, system bus
+      check, and other low-level initialization so the
+      <acronym>CPU</acronym> can set up the computer properly.  The
+      important step of this stage is determining the boot device.
+      Modern <acronym>BIOS</acronym> implementations permit the
+      selection of a boot device, allowing booting from a floppy,
+      <acronym>CD-ROM</acronym>, hard disk, or other devices.</para>
+
+    <para>The very last thing in the <acronym>POST</acronym> is the
+      <literal>INT 0x19</literal> instruction.  The
+      <literal>INT 0x19</literal> handler reads 512 bytes from the
+      first sector of the boot device into the memory at address
+      <literal>0x7c00</literal>.  The term
+      <emphasis>first sector</emphasis> originates from hard drive
+      architecture, where the magnetic plate is divided into a number
+      of cylindrical tracks.  Tracks are numbered, and every track is
+      divided into a number (usually 63) of sectors.  Track numbers
+      start at 0, but sector numbers start from 1.  Track 0 is the
+      outermost on the magnetic plate, and sector 1, the first sector,
+      has a special purpose.  It is also called the
+      <acronym>MBR</acronym>, or Master Boot Record.  The remaining
+      sectors on the first track are never used.</para>
+
+    <para>This sector is our boot-sequence starting point.  As we will
+      see, this sector contains a copy of our
+      <filename>boot0</filename> program.  A jump is made by the
+      <acronym>BIOS</acronym> to address <literal>0x7c00</literal> so
+      it starts executing.</para>
   </sect1>
 
   <sect1 xml:id="boot-boot0">
-    <title><literal>boot0</literal> Stage</title>
+    <title>The Master Boot Record (<literal>boot0</literal>)</title>
 
     <indexterm><primary>MBR</primary></indexterm>
-    <para>Take a look at the file <filename>/boot/boot0</filename>.
-      This is a small 512-byte file, and it is exactly what FreeBSD's
-      installation procedure wrote to your harddisk's MBR if you chose
-      the <quote>bootmanager</quote> option at installation
-      time.</para>
+
+    <para>After control is received from the <acronym>BIOS</acronym>
+      at memory address <literal>0x7c00</literal>,
+      <filename>boot0</filename> starts executing.  It is the first
+      piece of code under &os; control.  The task of
+      <filename>boot0</filename> is quite simple: scan the partition
+      table and let the user choose which partition to boot from.  The
+      Partition Table is a special, standard data structure embedded
+      in the <acronym>MBR</acronym> (hence embedded in
+      <filename>boot0</filename>) describing the four standard PC
+      <quote>partitions</quote>
+      <footnote>
+	<para><link
+	  xlink:href="http://en.wikipedia.org/wiki/Master_boot_record"></link></para></footnote>.
+      <filename>boot0</filename> resides in the filesystem as
+      <filename>/boot/boot0</filename>.  It is a small 512-byte file,
+      and it is exactly what &os;'s installation procedure wrote to
+      the hard disk's <acronym>MBR</acronym> if you chose the <quote>bootmanager</quote>
+      option at installation time.  Indeed,
+      <filename>boot0</filename> <emphasis>is</emphasis> the
+      <acronym>MBR</acronym>.</para>
 
     <para>As mentioned previously, the <literal>INT 0x19</literal>
-      instruction loads an MBR, i.e., the <filename>boot0</filename>
-      content, into the memory at address 0x7c00.  Taking a look at
-      the file <filename>sys/boot/i386/boot0/boot0.S</filename> can
-      give a guess at what is happening there - this is the boot
-      manager, which is an awesome piece of code written by Robert
-      Nordier.</para>
-
-    <para>The MBR, or, <filename>boot0</filename>, has a special
-      structure starting from offset 0x1be, called the
-      <emphasis>partition table</emphasis>.  It has 4 records of 16
-      bytes each, called <emphasis>partition records</emphasis>, which
-      represent how the harddisk(s) are partitioned, or, in FreeBSD's
+      instruction causes the <literal>INT 0x19</literal> handler to
+      load an <acronym>MBR</acronym> (<filename>boot0</filename>) into
+      memory at address <literal>0x7c00</literal>.  The source file
+      for <filename>boot0</filename> can be found in
+      <filename>sys/boot/i386/boot0/boot0.S</filename> - which is an
+      awesome piece of code written by Robert Nordier.</para>
+
+    <para>A special structure starting from offset
+      <literal>0x1be</literal> in the <acronym>MBR</acronym> is called
+      the <emphasis>partition table</emphasis>.  It has four records
+      of 16 bytes each, called <emphasis>partition records</emphasis>,
+      which represent how the hard disk is partitioned, or, in &os;'s
       terminology, sliced.  One byte of those 16 says whether a
       partition (slice) is bootable or not.  Exactly one record must
       have that flag set, otherwise <filename>boot0</filename>'s code
@@ -229,186 +288,1471 @@ Timecounter "i8254"  frequency 1193182 H
       </listitem>
     </itemizedlist>
 
-    <para>A partition record descriptor has the information about
+    <para>A partition record descriptor contains information about
       where exactly the partition resides on the drive.  Both
-      descriptors, LBA and CHS, describe the same information, but in
-      different ways: LBA (Logical Block Addressing) has the starting
-      sector for the partition and the partition's length, while CHS
-      (Cylinder Head Sector) has coordinates for the first and last
-      sectors of the partition.</para>
-
-    <para>The boot manager scans the partition table and prints the
-      menu on the screen so the user can select what disk and what
-      slice to boot.  By pressing an appropriate key,
-      <filename>boot0</filename> performs the following
-      actions:</para>
+      descriptors, <acronym>LBA</acronym> and <acronym>CHS</acronym>,
+      describe the same information, but in different ways:
+      <acronym>LBA</acronym> (Logical Block Addressing) has the
+      starting sector for the partition and the partition's length,
+      while <acronym>CHS</acronym> (Cylinder Head Sector) has
+      coordinates for the first and last sectors of the partition.
+      The partition table ends with the special signature
+      <literal>0xaa55</literal>.</para>
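The layout described above (four 16-byte records starting at offset 0x1be, followed by the 0xaa55 signature in the last two bytes of the sector) can be illustrated with a short Python sketch. This is not part of boot0 or of the FreeBSD tree. The function name and dictionary keys are hypothetical, and the field order within a record (flag, CHS start, type, CHS end, LBA start, length) is the conventional MBR layout rather than something spelled out in this chapter.

```python
import struct

PARTITION_TABLE_OFFSET = 0x1BE   # offset of the table inside the MBR
ENTRY_SIZE = 16                  # bytes per partition record
SIGNATURE_OFFSET = 510           # last two bytes of the sector


def parse_partition_table(sector):
    """Decode the four partition records of a 512-byte MBR image."""
    if len(sector) != 512:
        raise ValueError("an MBR is exactly one 512-byte sector")
    (signature,) = struct.unpack_from("<H", sector, SIGNATURE_OFFSET)
    if signature != 0xAA55:
        raise ValueError("missing 0xaa55 signature")

    entries = []
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        # 1 byte flag, 3 bytes CHS start, 1 byte type, 3 bytes CHS end,
        # 4 bytes LBA start, 4 bytes length in sectors (little-endian).
        flag, chs_first, ptype, chs_last, lba_start, length = \
            struct.unpack_from("<B3sB3sII", sector, offset)
        entries.append({
            "bootable": flag == 0x80,
            "type": ptype,
            "chs_first": chs_first,
            "chs_last": chs_last,
            "lba_start": lba_start,
            "sectors": length,
        })
    return entries
```

Calling `parse_partition_table()` on the first 512 bytes of a disk image returns one dictionary per slice. An unused slot shows up with a type of 0.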
+
+    <para>The <acronym>MBR</acronym> must fit into 512 bytes, a single
+      disk sector.  This program uses low-level <quote>tricks</quote>
+      like taking advantage of the side effects of certain
+      instructions and reusing register values from previous
+      operations to make the most out of the fewest possible
+      instructions.  Care must also be taken when handling the
+      partition table, which is embedded in the <acronym>MBR</acronym>
+      itself.  For these reasons, be very careful when modifying
+      <filename>boot0.S</filename>.</para>
+
+    <para>Note that the <filename>boot0.S</filename> source file
+      is assembled <quote>as is</quote>: instructions are translated
+      one by one to binary, with no additional information (no
+      <acronym>ELF</acronym> file format, for example).  This kind of
+      low-level control is achieved at link time through special
+      control flags passed to the linker.  For example, the text
+      section of the program is set to be located at address
+      <literal>0x600</literal>.  In practice this means that
+      <filename>boot0</filename> must be loaded to memory address
+      <literal>0x600</literal> in order to function properly.</para>
+
+    <para>It is worth looking at the <filename>Makefile</filename> for
+      <filename>boot0</filename>
+      (<filename>sys/boot/i386/boot0/Makefile</filename>), as it
+      defines some of the run-time behavior of
+      <filename>boot0</filename>.  For instance, if a terminal
+      connected to the serial port (COM1) is used for I/O, the macro
+      <literal>SIO</literal> must be defined
+      (<literal>-DSIO</literal>).  <literal>-DPXE</literal> enables
+      boot through <acronym>PXE</acronym> by pressing
+      <keycap>F6</keycap>.  Additionally, the program defines a set of
+      <emphasis>flags</emphasis> that allow further modification of
+      its behavior.  All of this is illustrated in the
+      <filename>Makefile</filename>.  For example, look at the
+      linker directives which command the linker to start the text
+      section at address <literal>0x600</literal>, and to build the
+      output file <quote>as is</quote> (strip out any file
+      formatting):</para>
+
+    <figure xml:id="boot-boot0-makefile-as-is">
+      <title><filename>sys/boot/i386/boot0/Makefile</filename></title>
+
+      <programlisting>      BOOT_BOOT0_ORG?=0x600
+      LDFLAGS=-e start -Ttext ${BOOT_BOOT0_ORG} \
+      -Wl,-N,-S,--oformat,binary</programlisting>
+    </figure>
+
+    <para>Let us now begin our study of the <acronym>MBR</acronym>,
+      or <filename>boot0</filename>, at the point where execution
+      begins.</para>
+
+    <note>
+      <para>Some modifications have been made to some instructions in
+	favor of better exposition.  For example, some macros are
+	expanded, and some macro tests are omitted when the result of
+	the test is known.  This applies to all of the code examples
+	shown.</para>
+    </note>
+
+    <figure xml:id="boot-boot0-entrypoint">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>start:
+      cld			# String ops inc
+      xorw %ax,%ax		# Zero
+      movw %ax,%es		# Address
+      movw %ax,%ds		#  data
+      movw %ax,%ss		# Set up
+      movw $0x7c00,%sp		#  stack</programlisting>
+    </figure>
+
+    <para>This first block of code is the entry point of the program.
+      It is where the <acronym>BIOS</acronym> transfers control.
+      First, it makes sure that the string operations autoincrement
+      their pointer operands (the <literal>cld</literal> instruction)
+      <footnote>
+	<para>When in doubt, we refer the reader to the official Intel
+	  manuals, which describe the exact semantics for each
+	  instruction: <link
+	    xlink:href="http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html"></link>.</para></footnote>.
+      Then, as it makes no assumption about the state of the segment
+      registers, it initializes them.  Finally, it sets the stack
+      pointer register (<literal>%sp</literal>) to address
+      <literal>0x7c00</literal>, so we have a working stack.</para>
+
+    <para>The next block is responsible for the relocation and
+      subsequent jump to the relocated code.</para>
+
+    <figure xml:id="boot-boot0-relocation">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>      movw $0x7c00,%si	# Source
+      movw $0x600,%di		# Destination
+      movw $512,%cx		# Byte count
+      rep			# Relocate
+      movsb			#  code
+      movw %di,%bp		# Address variables
+      movb $16,%cl		# Bytes to clear
+      rep			# Zero
+      stosb			#  them
+      incb -0xe(%di)		# Set the S field to 1
+      jmp main-0x7c00+0x600	# Jump to relocated code</programlisting>
+    </figure>
+
+    <para>Because <filename>boot0</filename> is loaded by the
+      <acronym>BIOS</acronym> to address <literal>0x7c00</literal>, it
+      copies itself to address <literal>0x600</literal> and then
+      transfers control there (recall that it was linked to execute at
+      address <literal>0x600</literal>).  The source address,
+      <literal>0x7c00</literal>, is copied to register
+      <literal>%si</literal>.  The destination address,
+      <literal>0x600</literal>, is copied to register <literal>%di</literal>.
+      The number of bytes to copy, <literal>512</literal> (the
+      program's size), is copied to register <literal>%cx</literal>.
+      Next, the <literal>rep</literal> instruction repeats the
+      instruction that follows, that is, <literal>movsb</literal>, the
+      number of times dictated by the <literal>%cx</literal> register.
+      The <literal>movsb</literal> instruction copies the byte pointed
+      to by <literal>%si</literal> to the address pointed to by
+      <literal>%di</literal>.  This is repeated another 511 times.  On
+      each repetition, both the source and destination registers,
+      <literal>%si</literal> and <literal>%di</literal>, are
+      incremented by one.  Thus, upon completion of the 512-byte copy,
+      <literal>%di</literal> has the value
+      <literal>0x600</literal>+<literal>512</literal>=
+      <literal>0x800</literal>, and <literal>%si</literal> has the
+      value <literal>0x7c00</literal>+<literal>512</literal>=
+      <literal>0x7e00</literal>; we have thus completed the code
+      <emphasis>relocation</emphasis>.</para>
+
+    <para>Next, the destination register
+      <literal>%di</literal> is copied to <literal>%bp</literal>.
+      <literal>%bp</literal> gets the value <literal>0x800</literal>.
+      The value <literal>16</literal> is copied to
+      <literal>%cl</literal> in preparation for a new string operation
+      (like our previous <literal>movsb</literal>).  Now,
+      <literal>stosb</literal> is executed 16 times.  This instruction
+      copies a <literal>0</literal> value to the address pointed to by
+      the destination register (<literal>%di</literal>, which is
+      <literal>0x800</literal>), and increments it.  This is repeated
+      another 15 times, so <literal>%di</literal> ends up with value
+      <literal>0x810</literal>.  Effectively, this clears the address
+      range <literal>0x800</literal>-<literal>0x80f</literal>.  This
+      range is used as a (fake) partition table for writing the
+      <acronym>MBR</acronym> back to disk.  Finally, the sector field
+      for the <acronym>CHS</acronym> addressing of this fake partition
+      is given the value 1 and a jump is made to the main function
+      from the relocated code.  Note that until this jump to the
+      relocated code, any reference to an absolute address was
+      avoided.</para>
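The register arithmetic in the last two paragraphs can be checked with a small Python model of the relocation block. This is only a sketch: memory is modelled as a flat bytearray, the function name is hypothetical, and %al is assumed to be zero when stosb runs (it was cleared by the earlier xorw %ax,%ax).

```python
LOAD = 0x7C00     # where the BIOS places boot0
ORIGIN = 0x600    # where boot0 was linked to run


def relocate_and_clear(mem):
    """Model the copy, the 16-byte clear, and the incb on a bytearray."""
    si, di, cx = LOAD, ORIGIN, 512
    while cx:                     # rep movsb
        mem[di] = mem[si]
        si += 1
        di += 1
        cx -= 1

    bp = di                       # movw %di,%bp
    cx = 16                       # movb $16,%cl (%ch is already 0)
    while cx:                     # rep stosb, storing %al == 0
        mem[di] = 0
        di += 1
        cx -= 1

    # incb -0xe(%di): 0x810 - 0xe = 0x802, the sector byte of the
    # fake partition entry that starts at 0x800.
    mem[di - 0xE] = (mem[di - 0xE] + 1) & 0xFF
    return si, di, bp
```

Running the model on a 64 KB bytearray returns the final values of %si, %di and %bp, which should match the figures given in the text: 0x7e00, 0x810 and 0x800.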
+
+    <para>The following code block tests whether the drive number
+      provided by the <acronym>BIOS</acronym> should be used, or
+      the one stored in <filename>boot0</filename>.</para>
+
+    <figure xml:id="boot-boot0-drivenumber">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>main:
+      testb $SETDRV,-69(%bp)	# Set drive number?
+      jnz disable_update	# Yes
+      testb %dl,%dl		# Drive number valid?
+      js save_curdrive		# Possibly (0x80 set)</programlisting>
+    </figure>
+
+    <para>This code tests the <literal>SETDRV</literal> bit
+      (<literal>0x20</literal>) in the <emphasis>flags</emphasis>
+      variable.  Recall that register <literal>%bp</literal> points to
+      address location <literal>0x800</literal>, so the test is done
+      to the <emphasis>flags</emphasis> variable at address
+      <literal>0x800</literal>-<literal>69</literal>=
+      <literal>0x7bb</literal>.  This is an example of the type of
+      modifications that can be done to <filename>boot0</filename>.
+      The <literal>SETDRV</literal> flag is not set by default, but it
+      can be set in the <filename>Makefile</filename>.  When set, the
+      drive number stored in the <acronym>MBR</acronym> is used
+      instead of the one provided by the <acronym>BIOS</acronym>.  We
+      assume the defaults, and that the <acronym>BIOS</acronym>
+      provided a valid drive number, so we jump to
+      <literal>save_curdrive</literal>.</para>
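The address arithmetic and the two tests can be restated as a short Python sketch. It is an illustration only: the function name is hypothetical, and the fall-through case (SETDRV clear but bit 0x80 of the BIOS drive number not set) is not shown in the excerpt above, so treating it as "use the stored drive number" is an assumption of this sketch.

```python
SETDRV = 0x20            # bit tested in the flags variable
BP = 0x800               # value of %bp after relocation
FLAGS_ADDR = BP - 69     # address tested by testb $SETDRV,-69(%bp)


def next_label(flags, bios_drive):
    """Return the label that the drive-number test transfers control to."""
    if flags & SETDRV:           # jnz disable_update
        return "disable_update"
    if bios_drive & 0x80:        # testb %dl,%dl ; js save_curdrive
        return "save_curdrive"
    # Assumption: an implausible BIOS drive number falls through to the
    # code that uses the drive number stored in the MBR.
    return "disable_update"
```

With the default flags and a BIOS drive number of 0x80, `next_label()` returns `"save_curdrive"`, which is the path followed in the rest of this walkthrough.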
+
+    <para>The next block saves the drive number provided by the
+      <acronym>BIOS</acronym>, and calls <literal>putn</literal> to
+      print a new line on the screen.</para>
+
+    <figure xml:id="boot-boot0-savedrivenumber">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>save_curdrive:
+      movb %dl, (%bp)		# Save drive number
+      pushw %dx			# Also in the stack
+#ifdef	TEST	/* test code, print internal bios drive */
+      rolb $1, %dl
+      movw $drive, %si
+      call putkey
+#endif
+      callw putn		# Print a newline</programlisting>
+    </figure>
+
+    <para>Note that we assume <varname>TEST</varname> is not defined,
+      so the conditional code in it is not assembled and will not
+      appear in our executable <filename>boot0</filename>.</para>
+
+    <para>Our next block implements the actual scanning of the
+      partition table.  It prints to the screen the partition type for
+      each of the four entries in the partition table.  It compares
+      each type with a list of well-known operating system file
+      systems.  Examples of recognized partition types are
+      <acronym>NTFS</acronym> (&windows;, ID 0x7),
+      <literal>ext2fs</literal> (&linux;, ID 0x83), and, of course,
+      <literal>ffs</literal>/<literal>ufs2</literal> (&os;, ID 0xa5).
+      The implementation is fairly simple.</para>
+
+    <figure xml:id="boot-boot0-partition-scan">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>      movw $(partbl+0x4),%bx	# Partition table (+4)
+      xorw %dx,%dx		# Item number
+
+read_entry:
+      movb %ch,-0x4(%bx)	# Zero active flag (ch == 0)
+      btw %dx,_FLAGS(%bp)	# Entry enabled?
+      jnc next_entry		# No
+      movb (%bx),%al		# Load type
+      test %al, %al		# skip empty partition
+      jz next_entry
+      movw $bootable_ids,%di	# Lookup tables
+      movb $(TLEN+1),%cl	# Number of entries
+      repne			# Locate
+      scasb			#  type
+      addw $(TLEN-1), %di	# Adjust
+      movb (%di),%cl		# Partition
+      addw %cx,%di		#  description
+      callw putx		# Display it
+
+next_entry:
+      incw %dx			# Next item
+      addb $0x10,%bl		# Next entry
+      jnc read_entry		# Till done</programlisting>
+    </figure>
+
+    <para>It is important to note that the active flag for each entry
+      is cleared, so after the scanning, <emphasis>no</emphasis>
+      partition entry is active in our memory copy of
+      <filename>boot0</filename>.  Later, the active flag will be set
+      for the selected partition.  This ensures that only one active
+      partition exists if the user chooses to write the changes back
+      to disk.</para>
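+
+    <para>As a rough illustration only, the scan can be modelled in
+      Python on a 512-byte copy of the <acronym>MBR</acronym>.  This
+      is a sketch, not a translation of <filename>boot0.S</filename>:
+      the names <literal>scan_partitions</literal>,
+      <literal>make_mbr</literal> and <literal>KNOWN_TYPES</literal>
+      are hypothetical, the type table is a small subset with made-up
+      labels, and the per-entry enable bits tested through
+      <literal>_FLAGS</literal> are left out.</para>

```python
import struct

# Hypothetical subset of the type table; the labels are made up for this sketch.
KNOWN_TYPES = {0x07: "Win", 0x83: "Linux", 0xA5: "FreeBSD"}

PARTBL_OFFSET = 0x1BE   # partition table offset inside the 512-byte MBR
ENTRY_SIZE = 0x10       # each of the four entries is 16 bytes


def scan_partitions(mbr):
    """Clear every active flag and return (slot, label) for non-empty entries."""
    found = []
    for slot in range(4):
        base = PARTBL_OFFSET + slot * ENTRY_SIZE
        mbr[base] = 0x00                 # zero the active flag (byte 0)
        ptype = mbr[base + 4]            # partition type lives at byte 4
        if ptype == 0:
            continue                     # skip empty entries
        found.append((slot, KNOWN_TYPES.get(ptype, "??")))
    return found


def make_mbr():
    """Build a toy MBR: slot 0 is an active NTFS entry, slot 1 is FreeBSD."""
    mbr = bytearray(512)
    mbr[PARTBL_OFFSET] = 0x80
    mbr[PARTBL_OFFSET + 4] = 0x07
    mbr[PARTBL_OFFSET + ENTRY_SIZE + 4] = 0xA5
    struct.pack_into("<H", mbr, 0x1FE, 0xAA55)
    return mbr


if __name__ == "__main__":
    image = make_mbr()
    print(scan_partitions(image))
```

+    <para>With the toy image built by <literal>make_mbr</literal>,
+      the sketch reports slots 0 and 1 and leaves the active byte of
+      slot 0 cleared, which mirrors the point made above: after the
+      scan, no entry in the in-memory copy is active.</para>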
+
+    <para>The next block tests for other drives.  At startup,
+      the <acronym>BIOS</acronym> writes the number of drives present
+      in the computer to address <literal>0x475</literal>.  If there
+      are any other drives present, <filename>boot0</filename> prints
+      the current drive to screen.  The user may command
+      <filename>boot0</filename> to scan partitions on another drive
+      later.</para>
+
+    <figure xml:id="boot-boot0-test-drives">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>      popw %ax			# Drive number
+      subb $0x80-0x1,%al	# Does next
+      cmpb 0x475,%al		#  drive exist? (from BIOS?)
+      jb print_drive		# Yes
+      decw %ax			# Already drive 0?
+      jz print_prompt		# Yes</programlisting>
+    </figure>
+
+    <para>We make the assumption that a single drive is present, so
+      the jump to <literal>print_drive</literal> is not performed.  We
+      also assume nothing strange happened, so we jump to
+      <literal>print_prompt</literal>.</para>
+
+    <para>This next block just prints out a prompt followed by the
+      default option:</para>
+
+    <figure xml:id="boot-boot0-prompt">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>print_prompt:
+      movw $prompt,%si		# Display
+      callw putstr		#  prompt
+      movb _OPT(%bp),%dl	# Display
+      decw %si			#  default
+      callw putkey		#  key
+      jmp start_input		# Skip beep</programlisting>
+    </figure>
+
+    <para>Finally, a jump is performed to
+      <literal>start_input</literal>, where the
+      <acronym>BIOS</acronym> services are used to start a timer and
+      to read user input from the keyboard; if the timer expires,
+      the default option will be selected:</para>
+
+    <figure xml:id="boot-boot0-start-input">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>start_input:
+      xorb %ah,%ah		# BIOS: Get
+      int $0x1a			#  system time
+      movw %dx,%di		# Ticks when
+      addw _TICKS(%bp),%di	#  timeout
+read_key:
+      movb $0x1,%ah		# BIOS: Check
+      int $0x16			#  for keypress
+      jnz got_key		# Have input
+      xorb %ah,%ah		# BIOS: int 0x1a, 00
+      int $0x1a			#  get system time
+      cmpw %di,%dx		# Timeout?
+      jb read_key		# No</programlisting>
+    </figure>
+
+    <para>An interrupt is requested with number
+      <literal>0x1a</literal> and argument <literal>0</literal> in
+      register <literal>%ah</literal>.  The <acronym>BIOS</acronym>
+      has a predefined set of services, requested by applications as
+      software-generated interrupts through the <literal>int</literal>
+      instruction and receiving arguments in registers (in this case,
+      <literal>%ah</literal>).  Here, particularly, we are requesting
+      the number of clock ticks since last midnight; this value is
+      computed by the <acronym>BIOS</acronym> through the
+      <acronym>RTC</acronym> (Real Time Clock).  This clock can be
+      programmed to work at frequencies ranging from 2 Hz to
+      8192 Hz.  The <acronym>BIOS</acronym> sets it to
+      18.2 Hz at startup.  When the request is satisfied, a
+      32-bit result is returned by the <acronym>BIOS</acronym> in
+      registers <literal>%cx</literal> and <literal>%dx</literal>
+      (low-order word in <literal>%dx</literal>).  This result (the
+      <literal>%dx</literal> part) is copied to register
+      <literal>%di</literal>, and the value of the
+      <varname>TICKS</varname> variable is added to
+      <literal>%di</literal>.  This variable resides in
+      <filename>boot0</filename> at offset <literal>_TICKS</literal>
+      (a negative value) from register <literal>%bp</literal> (which,
+      recall, points to <literal>0x800</literal>).  The default value
+      of this variable is <literal>0xb6</literal> (182 in decimal).
+      Now, the idea is that <filename>boot0</filename> constantly
+      requests the time from the <acronym>BIOS</acronym>, and when the
+      value returned in register <literal>%dx</literal> is greater
+      than the value stored in <literal>%di</literal>, the time is up
+      and the default selection will be made.  Since the
+      <acronym>RTC</acronym> ticks 18.2 times per second, the default
+      of 182 ticks elapses after about 10 seconds (this default
+      behavior can be changed in the
+      <filename>Makefile</filename>).  Until this time has passed,
+      <filename>boot0</filename> continually asks the
+      <acronym>BIOS</acronym> for any user input; this is done through
+      <literal>int 0x16</literal>, argument <literal>1</literal> in
+      <literal>%ah</literal>.</para>
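+
+    <para>The polling loop described above can be sketched in Python.
+      This is only a model under stated assumptions:
+      <literal>tick_source</literal> and <literal>key_ready</literal>
+      are hypothetical callables standing in for
+      <literal>int 0x1a</literal> and <literal>int 0x16</literal>, and
+      only the low 16 bits of the tick count are considered, as in the
+      listing.</para>

```python
TICKS = 0xB6          # default timeout in ticks, as stated in the text
TICK_HZ = 18.2        # tick rate quoted in the text


def deadline(now_dx, ticks=TICKS):
    """Mimic 'movw %dx,%di; addw _TICKS(%bp),%di' with 16-bit wraparound."""
    return (now_dx + ticks) & 0xFFFF


def wait_for_key(tick_source, key_ready):
    """Poll until key_ready() is true or the tick count reaches the deadline.

    tick_source and key_ready are hypothetical callables standing in for
    int 0x1a (AH=0) and int 0x16 (AH=1).  Returns 'key' or 'timeout'.
    """
    limit = deadline(tick_source())
    while True:
        if key_ready():
            return "key"
        if not tick_source() < limit:   # 'jb read_key' loops while dx < di
            return "timeout"


if __name__ == "__main__":
    # 182 ticks at 18.2 ticks per second is roughly ten seconds.
    print(round(TICKS / TICK_HZ))
```

+    <para>Because the sketch masks the sum to 16 bits, it keeps the
+      same wraparound behavior as the 16-bit registers it
+      imitates.</para>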
+
+    <para>Whether a key was pressed or the time expired, subsequent
+      code validates the selection.  Based on the selection, the
+      register <literal>%si</literal> is set to point to the
+      appropriate partition entry in the partition table.  This new
+      selection overrides the previous default one.  Indeed, it
+      becomes the new default.  Finally, the ACTIVE flag of the
+      selected partition is set.  If it was enabled at compile time,
+      the in-memory version of <filename>boot0</filename> with these
+      modified values is written back to the <acronym>MBR</acronym> on
+      disk.  We leave the details of this implementation to the
+      reader.</para>
+
+    <para>We now end our study with the last code block from the
+      <filename>boot0</filename> program:</para>
+
+    <figure xml:id="boot-boot0-check-bootable">
+      <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+      <programlisting>      movw $0x7c00,%bx		# Address for read
+      movb $0x2,%ah		# Read sector
+      callw intx13		#  from disk
+      jc beep			# If error
+      cmpw $0xaa55,0x1fe(%bx)	# Bootable?
+      jne beep			# No
+      pushw %si			# Save ptr to selected part.
+      callw putn		# Leave some space
+      popw %si			# Restore, next stage uses it
+      jmp *%bx			# Invoke bootstrap</programlisting>
+    </figure>
+
+    <para>Recall that <literal>%si</literal> points to the selected
+      partition entry.  This entry tells us where the partition begins
+      on disk.  We assume, of course, that the partition selected is
+      actually a &os; slice.</para>
+
+    <note>
+      <para>From now on, we will favor the use of the technically
+	more accurate term <quote>slice</quote> rather than
+	<quote>partition</quote>.</para>
+    </note>
+
+    <para>The transfer buffer is set to <literal>0x7c00</literal>
+      (register <literal>%bx</literal>), and a read for the first
+      sector of the &os; slice is requested by calling
+      <literal>intx13</literal>.  We assume that everything went okay,
+      so the jumps to <literal>beep</literal> are not performed.  In
+      particular, the sector just read must end with the magic
+      sequence <literal>0xaa55</literal>.  Finally, the value in
+      <literal>%si</literal> (the pointer to the selected partition
+      table entry) is preserved for use by the next stage, and a jump
+      is performed to address <literal>0x7c00</literal>, where
+      execution of our next stage (the just-read block) is
+      started.</para>
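+
+    <para>The signature test can be illustrated with a few lines of
+      Python.  The function name <literal>is_bootable</literal> is
+      hypothetical; the sketch only restates the comparison against
+      <literal>0xaa55</literal> at offset <literal>0x1fe</literal>,
+      read as a little-endian 16-bit value.</para>

```python
import struct

BOOT_SIG_OFFSET = 0x1FE   # offset used by 'cmpw $0xaa55,0x1fe(%bx)'
BOOT_SIG = 0xAA55


def is_bootable(sector):
    """Return True if a 512-byte sector ends with the bytes 0x55, 0xaa."""
    if len(sector) < 512:
        return False
    (sig,) = struct.unpack_from("<H", sector, BOOT_SIG_OFFSET)
    return sig == BOOT_SIG


if __name__ == "__main__":
    sector = bytearray(512)
    sector[0x1FE] = 0x55      # low byte is stored first on x86
    sector[0x1FF] = 0xAA
    print(is_bootable(bytes(sector)))
```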
+  </sect1>
+
+  <sect1 xml:id="boot-boot1">
+    <title><literal>boot1</literal> Stage</title>
+
+    <para>So far we have gone through the following sequence:</para>
 
     <itemizedlist>
       <listitem>
-	<para>modifies the bootable flag for the selected partition to
-	  make it bootable, and clears the previous</para>
+	<para>The <acronym>BIOS</acronym> did some early hardware
+	  initialization, including the <acronym>POST</acronym>.  The
+	  <acronym>MBR</acronym> (<filename>boot0</filename>) was
+	  loaded from absolute disk sector one to address
+	  <literal>0x7c00</literal>.  Execution control was passed to
+	  that location.</para>
       </listitem>
 
       <listitem>
-	<para>saves itself to disk to remember what partition (slice)
-	  has been selected so to use it as the default on the next
-	  boot</para>
+	<para><filename>boot0</filename> relocated itself to the
+	  address at which it was linked to execute
+	  (<literal>0x600</literal>), followed by a jump to continue
+	  execution at the appropriate place.  Finally,
+	  <filename>boot0</filename> loaded the first disk sector from
+	  the &os; slice to address <literal>0x7c00</literal>.
+	  Execution control was passed to that location.</para>
       </listitem>
+    </itemizedlist>
+
+    <para><filename>boot1</filename> is the next step in the
+      boot-loading sequence.  It is the first of three boot stages.
+      Note that we have been dealing exclusively
+      with disk sectors.  Indeed, the <acronym>BIOS</acronym> loads
+      the absolute first sector, while <filename>boot0</filename>
+      loads the first sector of the &os; slice.  Both loads are to
+      address <literal>0x7c00</literal>.  We can conceptually think of
+      these disk sectors as containing the files
+      <filename>boot0</filename> and <filename>boot1</filename>,
+      respectively, but in reality this is not entirely true for
+      <filename>boot1</filename>.  Strictly speaking, unlike
+      <filename>boot0</filename>, <filename>boot1</filename> is not
+      part of the boot blocks
+      <footnote>
+	<para>There is a file <filename>/boot/boot1</filename>, but it
+	  is not written to the beginning of the &os; slice.
+	  Instead, it is concatenated with <filename>boot2</filename>
+	  to form <filename>boot</filename>, which
+	  <emphasis>is</emphasis> written to the beginning of the &os;
+	  slice and read at boot time.</para></footnote>.
+      Instead, a single, full-blown file, <filename>boot</filename>
+      (<filename>/boot/boot</filename>), is what ultimately is
+      written to disk.  This file is a combination of
+      <filename>boot1</filename>, <filename>boot2</filename> and the
+      <literal>Boot Extender</literal> (or <acronym>BTX</acronym>).
+      This single file is greater in size than a single sector
+      (greater than 512 bytes).  Fortunately,
+      <filename>boot1</filename> occupies <emphasis>exactly</emphasis>
+      the first 512 bytes of this single file, so when
+      <filename>boot0</filename> loads the first sector of the &os;
+      slice (512 bytes), it is actually loading
+      <filename>boot1</filename> and transferring control to
+      it.</para>
+
+    <para>The main task of <filename>boot1</filename> is to load the
+      next boot stage.  This next stage is somewhat more complex.  It
+      is composed of a server called the <quote>Boot Extender</quote>,
+      or <acronym>BTX</acronym>, and a client, called
+      <filename>boot2</filename>.  As we will see, the last boot
+      stage, <filename>loader</filename>, is also a client of the
+      <acronym>BTX</acronym> server.</para>
+
+    <para>Let us now look in detail at what exactly is done by
+      <filename>boot1</filename>, starting like we did for
+      <filename>boot0</filename>, at its entry point:</para>
+
+    <figure xml:id="boot-boot1-entry">
+      <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+      <programlisting>start:
+	jmp main</programlisting>
+    </figure>
+
+    <para>The entry point at <literal>start</literal> simply jumps
+      past a special data area to the label <literal>main</literal>,
+      which in turn looks like this:</para>
+
+    <figure xml:id="boot-boot1-main">
+      <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+      <programlisting>main:
+      cld			# String ops inc
+      xor %cx,%cx		# Zero
+      mov %cx,%es		# Address
+      mov %cx,%ds		#  data
+      mov %cx,%ss		# Set up
+      mov $start,%sp		#  stack
+      mov %sp,%si		# Source
+      mov $0x700,%di		# Destination
+      incb %ch			# Word count
+      rep			# Copy
+      movsw			#  code</programlisting>
+    </figure>
+
+    <para>Just like <filename>boot0</filename>, this
+      code relocates <filename>boot1</filename>,
+      this time to memory address <literal>0x700</literal>.  However,
+      unlike <filename>boot0</filename>, it does not jump there.
+      <filename>boot1</filename> is linked to execute at
+      address <literal>0x7c00</literal>, effectively where it was
+      loaded in the first place.  The reason for this relocation will
+      be discussed shortly.</para>
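+
+    <para>To make the word count concrete, here is a small Python
+      model of the copy.  It is a sketch under the assumption that
+      memory is a flat byte array; <literal>relocate</literal> is a
+      hypothetical name.  Incrementing <literal>%ch</literal> while
+      <literal>%cx</literal> is zero yields <literal>0x100</literal>,
+      and since <literal>movsw</literal> moves two bytes at a time,
+      512 bytes are copied.</para>

```python
LOAD_ADDR = 0x7C00    # where the sector was loaded
RELOC_ADDR = 0x700    # destination used by boot1


def relocate(memory):
    """Copy 0x100 words (512 bytes) from LOAD_ADDR to RELOC_ADDR."""
    cx = 0x0000                                            # after 'xor %cx,%cx'
    cx = (cx & 0x00FF) | ((((cx >> 8) + 1) & 0xFF) << 8)   # 'incb %ch'
    nbytes = cx * 2                                        # movsw copies words
    memory[RELOC_ADDR:RELOC_ADDR + nbytes] = memory[LOAD_ADDR:LOAD_ADDR + nbytes]
    return cx


if __name__ == "__main__":
    mem = bytearray(0x10000)
    mem[LOAD_ADDR:LOAD_ADDR + 512] = bytes(range(256)) * 2
    print(hex(relocate(mem)))
```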
+
+    <para>Next comes a loop that looks for the &os; slice.  Although
+      <filename>boot0</filename> loaded <filename>boot1</filename>
+      from the &os; slice, no information was passed to it about this
+      <footnote>
+	<para>Actually we did pass a pointer to the slice entry in
+	  register <literal>%si</literal>.  However,
+	  <filename>boot1</filename> does not assume that it was
+	  loaded by <filename>boot0</filename> (perhaps some other
+	  <acronym>MBR</acronym> loaded it, and did not pass this
+	  information), so it assumes nothing.</para></footnote>,
+      so <filename>boot1</filename> must rescan the
+      partition table to find where the &os; slice starts.  Therefore
+      it rereads the <acronym>MBR</acronym>:</para>
+
+    <figure xml:id="boot-boot1-find-freebsd">
+      <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+      <programlisting>      mov $part4,%si		# Partition
+      cmpb $0x80,%dl		# Hard drive?
+      jb main.4			# No
+      movb $0x1,%dh		# Block count
+      callw nread		# Read MBR</programlisting>
+    </figure>
+
+    <para>In the code above, register <literal>%dl</literal>
+      maintains information about the boot device.  This is passed on
+      by the <acronym>BIOS</acronym> and preserved by the
+      <acronym>MBR</acronym>.  Values of <literal>0x80</literal> and
+      greater tell us that we are dealing with a hard drive, so a
+      call is made to <literal>nread</literal>, where the
+      <acronym>MBR</acronym> is read.  Arguments to
+      <literal>nread</literal> are passed through
+      <literal>%si</literal> and <literal>%dh</literal>.  The memory
+      address at label <literal>part4</literal> is copied to
+      <literal>%si</literal>.  This memory address holds a
+      <quote>fake partition</quote> to be used by
+      <literal>nread</literal>.  The following is the data in the fake
+      partition:</para>
+
+    <figure xml:id="boot-boot2-make-fake-partition">
+      <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+      <programlisting>      part4:
+	.byte 0x80, 0x00, 0x01, 0x00
+	.byte 0xa5, 0xfe, 0xff, 0xff
+	.byte 0x00, 0x00, 0x00, 0x00
+	.byte 0x50, 0xc3, 0x00, 0x00</programlisting>
+    </figure>
+
+    <para>In particular, the <acronym>LBA</acronym> for this fake
+      partition is hardcoded to zero.  This is used as an argument to
+      the <acronym>BIOS</acronym> for reading absolute sector one from
+      the hard drive.  Alternatively, CHS addressing could be used.
+      In this case, the fake partition holds cylinder 0, head 0 and
+      sector 1, which is equivalent to absolute sector one.</para>
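+
+    <para>The sixteen bytes can be decoded with a short Python sketch,
+      assuming the conventional <acronym>MBR</acronym> partition entry
+      layout (flag, starting CHS, type, ending CHS, starting
+      <acronym>LBA</acronym>, sector count).  The function name
+      <literal>decode_entry</literal> and the dictionary keys are
+      hypothetical.</para>

```python
import struct

# The bytes of the fake partition, copied from the listing above.
PART4 = bytes([
    0x80, 0x00, 0x01, 0x00,
    0xA5, 0xFE, 0xFF, 0xFF,
    0x00, 0x00, 0x00, 0x00,
    0x50, 0xC3, 0x00, 0x00,
])


def decode_entry(entry):
    """Decode one 16-byte MBR partition entry into a dictionary."""
    flag, h0, s0, c0, ptype, h1, s1, c1, lba, count = struct.unpack("<8BII", entry)
    return {
        "active": flag == 0x80,
        # The cylinder's two high bits live in the top bits of the sector byte.
        "start_chs": (c0 | ((s0 & 0xC0) << 2), h0, s0 & 0x3F),
        "type": ptype,
        "lba_start": lba,
        "sectors": count,
    }


if __name__ == "__main__":
    print(decode_entry(PART4))
```

+    <para>Decoded this way, the entry starts at cylinder 0, head 0,
+      sector 1, has a starting <acronym>LBA</acronym> of zero, and
+      carries the &os; type <literal>0xa5</literal>.</para>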
+
+    <para>Let us now proceed to take a look at
+      <literal>nread</literal>:</para>
+
+    <figure xml:id="boot-boot1-nread">
+      <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+      <programlisting>nread:
+      mov $0x8c00,%bx		# Transfer buffer
+      mov 0x8(%si),%ax		# Get
+      mov 0xa(%si),%cx		#  LBA
+      push %cs			# Read from
+      callw xread.1		#  disk
+      jnc return		# If success, return</programlisting>
+    </figure>
+
+    <para>Recall that <literal>%si</literal> points to the fake
+      partition.  The word
+      <footnote>
+	<para>In the context of 16-bit real mode, a word is 2
+	  bytes.</para></footnote>
+      at offset <literal>0x8</literal> is copied to register
+      <literal>%ax</literal> and the word at offset
+      <literal>0xa</literal> to <literal>%cx</literal>.  Together they
+      are interpreted by the <acronym>BIOS</acronym> as the lower four
+      bytes of the LBA to be read (the upper four bytes are assumed to
+      be zero).
+      Register <literal>%bx</literal> holds the memory address where
+      the <acronym>MBR</acronym> will be loaded.  The instruction
+      pushing <literal>%cs</literal> onto the stack is very
+      interesting.  In this context, it accomplishes nothing.  However, as
+      we will see shortly, <filename>boot2</filename>, in conjunction
+      with the <acronym>BTX</acronym> server, also uses
+      <literal>xread.1</literal>.  This mechanism will be discussed in
+      the next section.</para>
+
+    <para>The code at <literal>xread.1</literal> further calls
*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
    
    