From: Sergio Carlavilla
Date: Mon, 18 Apr 2022 16:37:21 +0200
Subject: Re: git: 954bbbabe3 - main - arch-handbook: Update boot chapter
To: Daniel Ebdrup Jensen
Cc: doc-committers@freebsd.org, dev-commits-doc-all@freebsd.org
List-Id: Commit messages for all branches of the doc repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-doc-all

On Mon, 18 Apr 2022 at 16:33, Daniel Ebdrup Jensen wrote:
>
> The branch main has been updated by debdrup:
>
> URL: https://cgit.FreeBSD.org/doc/commit/?id=954bbbabe38e5dddddeee2774f4330f99b62d912
>
> commit 954bbbabe38e5dddddeee2774f4330f99b62d912
> Author:     Isa
> AuthorDate: 2022-04-03 21:29:27 +0000
> Commit:     Daniel Ebdrup Jensen
> CommitDate: 2022-04-18 09:17:23 +0000
>
>     arch-handbook: Update boot chapter
>
>     A lot has changed in the code since RELEASE 10.0 in 2014, when this
>     document last received a major content change.
>
>     One significant change is in the path to the boot folder, ie
>     src/sys/boot has become src/stand/.
>
>     Another change is that various code blocks have had their sample texts
>     updated, such as the dmesg now looking like it does on a new install.
>
>     Similarly, the assembly code has been updated with the relevant sections
>     from the source tree. The spacing has been changed to be maximally
>     compatible with the original version.
>
>     Reviewed by:    imp (src), Pau Amma
>     Pull Request:   https://github.com/freebsd/freebsd-doc/pull/60
> ---
>  .../en/books/arch-handbook/boot/_index.adoc | 466 ++++++++++-----------
>  1 file changed, 233 insertions(+), 233 deletions(-)
>
> diff --git a/documentation/content/en/books/arch-handbook/boot/_index.adoc b/documentation/content/en/books/arch-handbook/boot/_index.adoc
> index ebed0609ca..c280b5fe12 100644
> --- a/documentation/content/en/books/arch-handbook/boot/_index.adoc
> +++ b/documentation/content/en/books/arch-handbook/boot/_index.adoc
> @@ -50,14 +50,14 @@ endif::[]
>  [[boot-synopsis]]
>  == Synopsis
>
> -This chapter is an overview of the boot and system initialization processes, starting from the BIOS (firmware) POST, to the first user process creation. Since the initial steps of system startup are very architecture dependent, the IA-32 architecture is used as an example.
> +This chapter is an overview of the boot and system initialization processes, starting from the BIOS (firmware) POST, to the first user process creation. Since the initial steps of system startup are very architecture dependent, the IA-32 architecture is used as an example. But the AMD64 and ARM64 architectures are much more important and compelling examples and should be explained in the near future according to the topic of this document.
>
>  The FreeBSD boot process can be surprisingly complex. After control is passed from the BIOS, a considerable amount of low-level configuration must be done before the kernel can be loaded and executed. This setup must be done in a simple and flexible manner, allowing the user a great deal of customization possibilities.
>
>  [[boot-overview]]
>  == Overview
>
> -The boot process is an extremely machine-dependent activity. Not only must code be written for every computer architecture, but there may also be multiple types of booting on the same architecture. For example, a directory listing of [.filename]#/usr/src/sys/boot# reveals a great amount of architecture-dependent code. There is a directory for each of the various supported architectures. In the x86-specific [.filename]#i386# directory, there are subdirectories for different boot standards like [.filename]#mbr# (Master Boot Record), [.filename]#gpt# (GUID Partition Table), and [.filename]#efi# (Extensible Firmware Interface). Each boot standard has its own conventions and data structures. The example that follows shows booting an x86 computer from an MBR hard drive with the FreeBSD [.filename]#boot0# multi-boot loader stored in the very first sector. That boot code starts the FreeBSD three-stage boot process.
> +The boot process is an extremely machine-dependent activity. Not only must code be written for every computer architecture, but there may also be multiple types of booting on the same architecture. For example, a directory listing of [.filename]#stand# reveals a great amount of architecture-dependent code. There is a directory for each of the various supported architectures. FreeBSD supports the CSM boot standard (Compatibility Support Module). So CSM is supported (with both GPT and MBR partitioning support) and UEFI booting (GPT is totally supported, MBR is mostly supported). It also supports loading files from ext2fs, MSDOS, UFS and ZFS. FreeBSD also supports the boot environment feature of ZFS which allows the HOST OS to communicate details about what to boot that go beyond a simple partition as was possible in the past. But UEFI is more relevant than the CMS these days. The example that follows shows booting an x86 computer from an MBR-partitioned hard drive with the FreeBSD [.filename]#boot0# multi-boot loader stored in the very first sector. That boot code starts the FreeBSD three-stage boot process.
>
>  The key to understanding this process is that it is a series of stages of increasing complexity. These stages are [.filename]#boot1#, [.filename]#boot2#, and [.filename]#loader# (see man:boot[8] for more detail). The boot system executes each stage in sequence. The last stage, [.filename]#loader#, is responsible for loading the FreeBSD kernel. Each stage is examined in the following sections.
>
> @@ -85,8 +85,8 @@ a|
>
>  [source,bash]
>  ....
> ->>FreeBSD/i386 BOOT
> -Default: 1:ad(1,a)/boot/loader
> +>>FreeBSD/x86 BOOT
> +Default: 0:ad(0p4)/boot/loader
>  boot:
>  ....
>
> @@ -102,7 +102,7 @@ BIOS 639kB/2096064kB available memory
>
>  FreeBSD/x86 bootstrap loader, Revision 1.1
>  Console internal video/keyboard
> -(root@snap.freebsd.org, Thu Jan 16 22:18:05 UTC 2014)
> +(root@releng1.nyi.freebsd.org, Fri Apr 9 04:04:45 UTC 2021)
>  Loading /boot/defaults/loader.conf
>  /boot/kernel/kernel text=0xed9008 data=0x117d28+0x176650 syms=[0x8+0x137988+0x8+0x1515f8]
>  ....
> @@ -112,13 +112,13 @@ a|
>
>  [source,bash]
>  ....
> -Copyright (c) 1992-2013 The FreeBSD Project.
> +Copyright (c) 1992-2021 The FreeBSD Project.
>  Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>          The Regents of the University of California. All rights reserved.
>  FreeBSD is a registered trademark of The FreeBSD Foundation.
> -FreeBSD 10.0-RELEASE 0 r260789: Thu Jan 16 22:34:59 UTC 2014
> -    root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> -FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
> +FreeBSD 13.0-RELEASE 0 releng/13.0-n244733-ea31abc261f: Fri Apr 9 04:04:45 UTC 2021
> +    root@releng1.nyi.freebsd.org:/usr/obj/usr/src/i386.i386/sys/GENERIC i386
> +FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
>  ....
>
>  |===
> @@ -143,7 +143,7 @@ This sector is our boot-sequence starting point. As we will see, this sector con
>
>  After control is received from the BIOS at memory address `0x7c00`, [.filename]#boot0# starts executing. It is the first piece of code under FreeBSD control. The task of [.filename]#boot0# is quite simple: scan the partition table and let the user choose which partition to boot from. The Partition Table is a special, standard data structure embedded in the MBR (hence embedded in [.filename]#boot0#) describing the four standard PC "partitions". [.filename]#boot0# resides in the filesystem as [.filename]#/boot/boot0#. It is a small 512-byte file, and it is exactly what FreeBSD's installation procedure wrote to the hard disk's MBR if you chose the "bootmanager" option at installation time. Indeed, [.filename]#boot0#_is_ the MBR.
>
> -As mentioned previously, the `INT 0x19` instruction causes the `INT 0x19` handler to load an MBR ([.filename]#boot0#) into memory at address `0x7c00`. The source file for [.filename]#boot0# can be found in [.filename]#sys/boot/i386/boot0/boot0.S# - which is an awesome piece of code written by Robert Nordier.
> +As mentioned previously, we're calling the BIOS `INT 0x19` to load the MBR ([.filename]#boot0#) into memory at address `0x7c00`. The source file for [.filename]#boot0# can be found in [.filename]#stand/i386/boot0/boot0.S# - which is an awesome piece of code written by Robert Nordier.
>
>  A special structure starting from offset `0x1be` in the MBR is called the _partition table_. It has four records of 16 bytes each, called _partition records_, which represent how the hard disk is partitioned, or, in FreeBSD's terminology, sliced. One byte of those 16 says whether a partition (slice) is bootable or not. Exactly one record must have that flag set, otherwise [.filename]#boot0#'s code will refuse to proceed.
>
> @@ -160,16 +160,15 @@ The MBR must fit into 512 bytes, a single disk sector. This program uses low-lev
>
>  Note that the [.filename]#boot0.S# source file is assembled "as is": instructions are translated one by one to binary, with no additional information (no ELF file format, for example). This kind of low-level control is achieved at link time through special control flags passed to the linker. For example, the text section of the program is set to be located at address `0x600`. In practice this means that [.filename]#boot0# must be loaded to memory address `0x600` in order to function properly.
>
> -It is worth looking at the [.filename]#Makefile# for [.filename]#boot0# ([.filename]#sys/boot/i386/boot0/Makefile#), as it defines some of the run-time behavior of [.filename]#boot0#. For instance, if a terminal connected to the serial port (COM1) is used for I/O, the macro `SIO` must be defined (`-DSIO`). `-DPXE` enables boot through PXE by pressing kbd:[F6]. Additionally, the program defines a set of _flags_ that allow further modification of its behavior. All of this is illustrated in the [.filename]#Makefile#. For example, look at the linker directives which command the linker to start the text section at address `0x600`, and to build the output file "as is" (strip out any file formatting):
> +It is worth looking at the [.filename]#Makefile# for [.filename]#boot0# ([.filename]#stand/i386/boot0/Makefile#), as it defines some of the run-time behavior of [.filename]#boot0#. For instance, if a terminal connected to the serial port (COM1) is used for I/O, the macro `SIO` must be defined (`-DSIO`). `-DPXE` enables boot through PXE by pressing kbd:[F6]. Additionally, the program defines a set of _flags_ that allow further modification of its behavior. All of this is illustrated in the [.filename]#Makefile#. For example, look at the linker directives which command the linker to start the text section at address `0x600`, and to build the output file "as is" (strip out any file formatting):
>
>  [.programlisting]
>  ....
>    BOOT_BOOT0_ORG?=0x600
> -  LDFLAGS=-e start -Ttext ${BOOT_BOOT0_ORG} \
> -  -Wl,-N,-S,--oformat,binary
> +  ORG=${BOOT_BOOT0_ORG}
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/Makefile# [[boot-boot0-makefile-as-is]]
> +.[.filename]#stand/i386/boot0/Makefile# [[boot-boot0-makefile-as-is]]
>  Let us now start our study of the MBR, or [.filename]#boot0#, starting where execution begins.
>
>  [NOTE]
> @@ -185,46 +184,50 @@ start:
>        movw %ax,%es # Address
>        movw %ax,%ds # data
>        movw %ax,%ss # Set up
> -      movw 0x7c00,%sp # stack
> +      movw $LOAD,%sp # stack
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-entrypoint]]
> -This first block of code is the entry point of the program. It is where the BIOS transfers control. First, it makes sure that the string operations autoincrement its pointer operands (the `cld` instruction) footnote:[When in doubt, we refer the reader to the official Intel manuals, which describe the exact semantics for each instruction: .]. Then, as it makes no assumption about the state of the segment registers, it initializes them. Finally, it sets the stack pointer register (`%sp`) to address `0x7c00`, so we have a working stack.
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-entrypoint]]
> +This first block of code is the entry point of the program. It is where the BIOS transfers control. First, it makes sure that the string operations autoincrement its pointer operands (the `cld` instruction) footnote:[When in doubt, we refer the reader to the official Intel manuals, which describe the exact semantics for each instruction: .]. Then, as it makes no assumption about the state of the segment registers, it initializes them. Finally, it sets the stack pointer register (`%sp`) to ($LOAD = address `0x7c00`), so we have a working stack.
>
>  The next block is responsible for the relocation and subsequent jump to the relocated code.
>
>  [.programlisting]
>  ....
> -      movw $0x7c00,%si # Source
> -      movw $0x600,%di # Destination
> -      movw $512,%cx # Word count
> +      movw %sp,%si # Source
> +      movw $start,%di # Destination
> +      movw $0x100,%cx # Word count
>        rep # Relocate
> -      movsb # code
> +      movsw # code
>        movw %di,%bp # Address variables
> -      movb $16,%cl # Words to clear
> +      movb $0x8,%cl # Words to clear
>        rep # Zero
> -      stosb # them
> +      stosw # them
>        incb -0xe(%di) # Set the S field to 1
> -      jmp main-0x7c00+0x600 # Jump to relocated code
> +      jmp main-LOAD+ORIGIN # Jump to relocated code
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-relocation]]
> -As [.filename]#boot0# is loaded by the BIOS to address `0x7C00`, it copies itself to address `0x600` and then transfers control there (recall that it was linked to execute at address `0x600`). The source address, `0x7c00`, is copied to register `%si`. The destination address, `0x600`, to register `%di`. The number of bytes to copy, `512` (the program's size), is copied to register `%cx`. Next, the `rep` instruction repeats the instruction that follows, that is, `movsb`, the number of times dictated by the `%cx` register. The `movsb` instruction copies the byte pointed to by `%si` to the address pointed to by `%di`. This is repeated another 511 times. On each repetition, both the source and destination registers, `%si` and `%di`, are incremented by one. Thus, upon completion of the 512-byte copy, `%di` has the value `0x600`+`512`= `0x800`, and `%si` has the value `0x7c00`+`512`= `0x7e00`; we have thus completed the code _relocation_.
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-relocation]]
> +As [.filename]#boot0# is loaded by the BIOS to address `0x7C00`, it copies itself to address `0x600` and then transfers control there (recall that it was linked to execute at address `0x600`). The source address, `0x7c00`, is copied to register `%si`. The destination address, `0x600`, to register `%di`. The number of words to copy, `256` (the program's size = 512 bytes), is copied to register `%cx`. Next, the `rep` instruction repeats the instruction that follows, that is, `movsw`, the number of times dictated by the `%cx` register. The `movsw` instruction copies the word pointed to by `%si` to the address pointed to by `%di`. This is repeated another 255 times. On each repetition, both the source and destination registers, `%si` and `%di`, are incremented by one. Thus, upon completion of the 256-word (512-byte) copy, `%di` has the value `0x600`+`512`= `0x800`, and `%si` has the value `0x7c00`+`512`= `0x7e00`; we have thus completed the code _relocation_. Since the last update of this document, the copy instructions have changed in the code, so instead of the movsb and stosb, movsw and stosw have been introduced, which copy 2 bytes(1 word) in one iteration.
>
> -Next, the destination register `%di` is copied to `%bp`. `%bp` gets the value `0x800`. The value `16` is copied to `%cl` in preparation for a new string operation (like our previous `movsb`). Now, `stosb` is executed 16 times. This instruction copies a `0` value to the address pointed to by the destination register (`%di`, which is `0x800`), and increments it. This is repeated another 15 times, so `%di` ends up with value `0x810`. Effectively, this clears the address range `0x800`-`0x80f`. This range is used as a (fake) partition table for writing the MBR back to disk. Finally, the sector field for the CHS addressing of this fake partition is given the value 1 and a jump is made to the main function from the relocated code. Note that until this jump to the relocated code, any reference to an absolute address was avoided.
> +Next, the destination register `%di` is copied to `%bp`. `%bp` gets the value `0x800`. The value `8` is copied to `%cl` in preparation for a new string operation (like our previous `movsw`). Now, `stosw` is executed 8 times. This instruction copies a `0` value to the address pointed to by the destination register (`%di`, which is `0x800`), and increments it. This is repeated another 7 times, so `%di` ends up with value `0x810`. Effectively, this clears the address range `0x800`-`0x80f`. This range is used as a (fake) partition table for writing the MBR back to disk. Finally, the sector field for the CHS addressing of this fake partition is given the value 1 and a jump is made to the main function from the relocated code. Note that until this jump to the relocated code, any reference to an absolute address was avoided.
>
>  The following code block tests whether the drive number provided by the BIOS should be used, or the one stored in [.filename]#boot0#.
>
>  [.programlisting]
>  ....
>  main:
> -      testb $SETDRV,-69(%bp) # Set drive number?
> +      testb $SETDRV,_FLAGS(%bp) # Set drive number?
> +#ifndef CHECK_DRIVE /* disable drive checks */
> +      jz save_curdrive # no, use the default
> +#else
>        jnz disable_update # Yes
>        testb %dl,%dl # Drive number valid?
>        js save_curdrive # Possibly (0x80 set)
> +#endif
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-drivenumber]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-drivenumber]]
>  This code tests the `SETDRV` bit (`0x20`) in the _flags_ variable. Recall that register `%bp` points to address location `0x800`, so the test is done to the _flags_ variable at address `0x800`-`69`= `0x7bb`. This is an example of the type of modifications that can be done to [.filename]#boot0#. The `SETDRV` flag is not set by default, but it can be set in the [.filename]#Makefile#. When set, the drive number stored in the MBR is used instead of the one provided by the BIOS. We assume the defaults, and that the BIOS provided a valid drive number, so we jump to `save_curdrive`.
>
>  The next block saves the drive number provided by the BIOS, and calls `putn` to print a new line on the screen.
> @@ -242,7 +245,7 @@ save_curdrive:
>        callw putn # Print a newline
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-savedrivenumber]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-savedrivenumber]]
>  Note that we assume `TEST` is not defined, so the conditional code in it is not assembled and will not appear in our executable [.filename]#boot0#.
>
>  Our next block implements the actual scanning of the partition table. It prints to the screen the partition type for each of the four entries in the partition table. It compares each type with a list of well-known operating system file systems. Examples of recognized partition types are NTFS (Windows(R), ID 0x7), `ext2fs` (Linux(R), ID 0x83), and, of course, `ffs`/`ufs2` (FreeBSD, ID 0xa5). The implementation is fairly simple.
> @@ -274,7 +277,7 @@ next_entry:
>        jnc read_entry # Till done
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-partition-scan]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-partition-scan]]
>  It is important to note that the active flag for each entry is cleared, so after the scanning, _no_ partition entry is active in our memory copy of [.filename]#boot0#. Later, the active flag will be set for the selected partition. This ensures that only one active partition exists if the user chooses to write the changes back to disk.
>
>  The next block tests for other drives. At startup, the BIOS writes the number of drives present in the computer to address `0x475`. If there are any other drives present, [.filename]#boot0# prints the current drive to screen. The user may command [.filename]#boot0# to scan partitions on another drive later.
> @@ -282,14 +285,14 @@ The next block tests for other drives. At startup, the BIOS writes the number of
>  [.programlisting]
>  ....
>        popw %ax # Drive number
> -      subb $0x79,%al # Does next
> -      cmpb 0x475,%al # drive exist? (from BIOS?)
> +      subb $0x80-0x1,%al # Does next
> +      cmpb NHRDRV,%al # drive exist? (from BIOS?)
>        jb print_drive # Yes
>        decw %ax # Already drive 0?
>        jz print_prompt # Yes
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-test-drives]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-test-drives]]
>  We make the assumption that a single drive is present, so the jump to `print_drive` is not performed. We also assume nothing strange happened, so we jump to `print_prompt`.
>
>  This next block just prints out a prompt followed by the default option:
> @@ -305,7 +308,7 @@ print_prompt:
>        jmp start_input # Skip beep
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-prompt]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-prompt]]
>  Finally, a jump is performed to `start_input`, where the BIOS services are used to start a timer and for reading user input from the keyboard; if the timer expires, the default option will be selected:
>
>  [.programlisting]
> @@ -325,7 +328,7 @@ read_key:
>        jb read_key # No
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-start-input]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-start-input]]
>  An interrupt is requested with number `0x1a` and argument `0` in register `%ah`. The BIOS has a predefined set of services, requested by applications as software-generated interrupts through the `int` instruction and receiving arguments in registers (in this case, `%ah`). Here, particularly, we are requesting the number of clock ticks since last midnight; this value is computed by the BIOS through the RTC (Real Time Clock). This clock can be programmed to work at frequencies ranging from 2 Hz to 8192 Hz. The BIOS sets it to 18.2 Hz at startup. When the request is satisfied, a 32-bit result is returned by the BIOS in registers `%cx` and `%dx` (lower bytes in `%dx`). This result (the `%dx` part) is copied to register `%di`, and the value of the `TICKS` variable is added to `%di`. This variable resides in [.filename]#boot0# at offset `_TICKS` (a negative value) from register `%bp` (which, recall, points to `0x800`). The default value of this variable is `0xb6` (182 in decimal). Now, the idea is that [.filename]#boot0# constantly requests the time from the BIOS, and when the value returned in register `%dx` is greater than the value stored in `%di`, the time is up and the default selection will be made. Since the RTC ticks 18.2 times per second, this condition will be met after 10 seconds (this default behavior can be changed in the [.filename]#Makefile#). Until this time has passed, [.filename]#boot0# continually asks the BIOS for any user input; this is done through `int 0x16`, argument `1` in `%ah`.
>
>  Whether a key was pressed or the time expired, subsequent code validates the selection. Based on the selection, the register `%si` is set to point to the appropriate partition entry in the partition table. This new selection overrides the previous default one. Indeed, it becomes the new default. Finally, the ACTIVE flag of the selected partition is set. If it was enabled at compile time, the in-memory version of [.filename]#boot0# with these modified values is written back to the MBR on disk. We leave the details of this implementation to the reader.
> @@ -334,11 +337,11 @@ We now end our study with the last code block from the [.filename]#boot0# progra
>
>  [.programlisting]
>  ....
> -      movw $0x7c00,%bx # Address for read
> +      movw $LOAD,%bx # Address for read
>        movb $0x2,%ah # Read sector
>        callw intx13 # from disk
>        jc beep # If error
> -      cmpw $0xaa55,0x1fe(%bx) # Bootable?
> +      cmpw $MAGIC,0x1fe(%bx) # Bootable?
>        jne beep # No
>        pushw %si # Save ptr to selected part.
>        callw putn # Leave some space
> @@ -346,7 +349,7 @@ We now end our study with the last code block from the [.filename]#boot0# progra
>        jmp *%bx # Invoke bootstrap
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-check-bootable]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-check-bootable]]
>  Recall that `%si` points to the selected partition entry. This entry tells us where the partition begins on disk. We assume, of course, that the partition selected is actually a FreeBSD slice.
>
>  [NOTE]
> @@ -376,7 +379,7 @@ start:
>        jmp main
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-entry]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-entry]]
>  The entry point at `start` simply jumps past a special data area to the label `main`, which in turn looks like this:
>
>  [.programlisting]
> @@ -389,13 +392,13 @@ main:
>        mov %cx,%ss # Set up
>        mov $start,%sp # stack
>        mov %sp,%si # Source
> -      mov $0x700,%di # Destination
> +      mov $MEM_REL,%di # Destination
>        incb %ch # Word count
>        rep # Copy
>        movsw # code
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-main]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-main]]
>  Just like [.filename]#boot0#, this code relocates [.filename]#boot1#, this time to memory address `0x700`. However, unlike [.filename]#boot0#, it does not jump there. [.filename]#boot1# is linked to execute at address `0x7c00`, effectively where it was loaded in the first place. The reason for this relocation will be discussed shortly.
>
>  Next comes a loop that looks for the FreeBSD slice. Although [.filename]#boot0# loaded [.filename]#boot1# from the FreeBSD slice, no information was passed to it about this footnote:[Actually we did pass a pointer to the slice entry in register %si. However, boot1 does not assume that it was loaded by boot0 (perhaps some other MBR loaded it, and did not pass this information), so it assumes nothing.], so [.filename]#boot1# must rescan the partition table to find where the FreeBSD slice starts. Therefore it rereads the MBR:
> @@ -409,7 +412,7 @@ Next comes a loop that looks for the FreeBSD slice. Although [.filename]#boot0#
>        callw nread # Read MBR
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-find-freebsd]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-find-freebsd]]
>  In the code above, register `%dl` maintains information about the boot device. This is passed on by the BIOS and preserved by the MBR. Numbers `0x80` and greater tells us that we are dealing with a hard drive, so a call is made to `nread`, where the MBR is read. Arguments to `nread` are passed through `%si` and `%dh`. The memory address at label `part4` is copied to `%si`. This memory address holds a "fake partition" to be used by `nread`. The following is the data in the fake partition:
>
>  [.programlisting]
> @@ -421,7 +424,7 @@ In the code above, register `%dl` maintains information about the boot device. T
>        .byte 0x50, 0xc3, 0x00, 0x00
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot2-make-fake-partition]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot2-make-fake-partition]]
>  In particular, the LBA for this fake partition is hardcoded to zero. This is used as an argument to the BIOS for reading absolute sector one from the hard drive. Alternatively, CHS addressing could be used. In this case, the fake partition holds cylinder 0, head 0 and sector 1, which is equivalent to absolute sector one.
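
A side note for readers of the paragraph just quoted: the LBA and CHS claims can be checked by decoding the 16-byte record by hand. The following is only a minimal Python sketch, not part of the commit. Only the last row of bytes (`0x50, 0xc3, 0x00, 0x00`) is visible in the hunk; the first three rows are an assumption based on my recollection of boot1.S, and `decode_entry` is a hypothetical helper name.

```python
import struct

# Fake partition record. Only the last row is visible in the quoted hunk;
# the first three rows are ASSUMED from memory of boot1.S.
FAKE_PART = bytes([
    0x80, 0x00, 0x01, 0x00,   # active flag, start head, start sector, start cylinder
    0xa5, 0xfe, 0xff, 0xff,   # type 0xa5 (FreeBSD), end CHS
    0x00, 0x00, 0x00, 0x00,   # start LBA, little-endian
    0x50, 0xc3, 0x00, 0x00,   # sector count, little-endian
])


def decode_entry(entry):
    """Decode one 16-byte MBR partition record (hypothetical helper)."""
    if len(entry) != 16:
        raise ValueError("a partition record is exactly 16 bytes")
    flag, head, sec_cyl, cyl_lo = entry[0:4]
    lba, count = struct.unpack_from("<II", entry, 8)
    return {
        "active": bool(flag & 0x80),
        "head": head,
        "sector": sec_cyl & 0x3f,                      # low 6 bits
        "cylinder": ((sec_cyl & 0xc0) << 2) | cyl_lo,  # high 2 bits plus low byte
        "type": entry[4],
        "lba": lba,
        "count": count,
    }


info = decode_entry(FAKE_PART)
print(info)
```

With the assumed bytes, the record decodes to LBA 0 and CHS (0, 0, 1), which is what the quoted text says `nread` uses to fetch the MBR.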
> > Let us now proceed to take a look at `nread`: > @@ -429,7 +432,7 @@ Let us now proceed to take a look at `nread`: > [.programlisting] > .... > nread: > - mov $0x8c00,%bx # Transfer buffer > + mov $MEM_BUF,%bx # Transfer buffer > mov 0x8(%si),%ax # Get > mov 0xa(%si),%cx # LBA > push %cs # Read from > @@ -437,7 +440,7 @@ nread: > jnc return # If success, return > .... > > -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-nread]] > +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-nread]] > Recall that `%si` points to the fake partition. The word footnote:[In th= e context of 16-bit real mode, a word is 2 bytes.] at offset `0x8` is copie= d to register `%ax` and word at offset `0xa` to `%cx`. They are interpreted= by the BIOS as the lower 4-byte value denoting the LBA to be read (the upp= er four bytes are assumed to be zero). Register `%bx` holds the memory addr= ess where the MBR will be loaded. The instruction pushing `%cs` onto the st= ack is very interesting. In this context, it accomplishes nothing. However,= as we will see shortly, [.filename]#boot2#, in conjunction with the BTX se= rver, also uses `xread.1`. This mechanism will be discussed in the next sec= tion. > > The code at `xread.1` further calls the `read` function, which actually = calls the BIOS asking for the disk sector: > @@ -460,7 +463,7 @@ xread.1: > lret # To far caller > .... > > -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-xread1]] > +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-xread1]] > Note the long return instruction at the end of this block. This instruct= ion pops out the `%cs` register pushed by `nread`, and returns. Finally, `n= read` also returns. > > With the MBR loaded to memory, the actual loop for searching the FreeBSD= slice begins: > @@ -469,10 +472,10 @@ With the MBR loaded to memory, the actual loop for = searching the FreeBSD slice b > .... 
> mov $0x1,%cx # Two passes > main.1: > - mov $0x8dbe,%si # Partition table > + mov $MEM_BUF+PRT_OFF,%si # Partition table > movb $0x1,%dh # Partition > main.2: > - cmpb $0xa5,0x4(%si) # Our partition type? > + cmpb $PRT_BSD,0x4(%si) # Our partition type? > jne main.3 # No > jcxz main.5 # If second pass > testb $0x80,(%si) # Active? > @@ -480,32 +483,32 @@ main.2: > main.3: > add $0x10,%si # Next entry > incb %dh # Partition > - cmpb $0x5,%dh # In table? > + cmpb $0x1+PRT_NUM,%dh # In table? > jb main.2 # Yes > dec %cx # Do two > jcxz main.1 # passes > .... > > -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-find-part]] > +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-find-part]] > If a FreeBSD slice is identified, execution continues at `main.5`. Note = that when a FreeBSD slice is found `%si` points to the appropriate entry in= the partition table, and `%dh` holds the partition number. We assume that = a FreeBSD slice is found, so we continue execution at `main.5`: > > [.programlisting] > .... > main.5: > - mov %dx,0x900 # Save args > - movb $0x10,%dh # Sector count > + mov %dx,MEM_ARG # Save args > + movb $NSECT,%dh # Sector count > callw nread # Read disk > - mov $0x9000,%bx # BTX > + mov $MEM_BTX,%bx # BTX > mov 0xa(%bx),%si # Get BTX length and set > add %bx,%si # %si to start of boot2.bin > - mov $0xc000,%di # Client page 2 > - mov $0xa200,%cx # Byte > + mov $MEM_USR+SIZ_PAG*2,%di # Client page = 2 > + mov $MEM_BTX+(NSECT-1)*SIZ_SEC,%cx # Byte > sub %si,%cx # count > rep # Relocate > movsb # client > .... > > -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-main5]] > +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-main5]] > Recall that at this point, register `%si` points to the FreeBSD slice en= try in the MBR partition table, so a call to `nread` will effectively read = sectors at the beginning of this partition. The argument passed on register= `%dh` tells `nread` to read 16 disk sectors. 
Recall that the first 512 bytes, or the first sector of the FreeBSD slice, coincides with the [.filename]#boot1# program. Also recall that the file written to the beginning of the FreeBSD slice is not [.filename]#/boot/boot1#, but [.filename]#/boot/boot#. Let us look at the size of these files in the filesystem:
>
>  [source,bash]
> @@ -550,7 +553,7 @@ seta20.3:
>   jmp 0x9010 # Start BTX
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-seta20]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-seta20]]
>  Note that right before the jump, interrupts are enabled.
>
>  [[btx-server]]
> @@ -562,7 +565,7 @@ Next in our boot sequence is the BTX Server. Let us quickly remember how we got
>  * [.filename]#boot0# relocates itself to `0x600`, the address it was linked to execute, and jumps over there. It then reads the first sector of the FreeBSD slice (which consists of [.filename]#boot1#) into address `0x7c00` and jumps over there.
>  * [.filename]#boot1# loads the first 16 sectors of the FreeBSD slice into address `0x8c00`. These 16 sectors, or 8192 bytes, make up the whole file [.filename]#boot#. The file is a concatenation of [.filename]#boot1# and [.filename]#boot2#. [.filename]#boot2#, in turn, contains the BTX server and the [.filename]#boot2# client. Finally, a jump is made to address `0x9010`, the entry point of the BTX server.
>
> -Before studying the BTX Server in detail, let us further review how the single, all-in-one [.filename]#boot# file is created. The way [.filename]#boot# is built is defined in its [.filename]#Makefile# ([.filename]#/usr/src/sys/boot/i386/boot2/Makefile#). Let us look at the rule that creates the [.filename]#boot# file:
> +Before studying the BTX Server in detail, let us further review how the single, all-in-one [.filename]#boot# file is created. The way [.filename]#boot# is built is defined in its [.filename]#Makefile# ([.filename]#stand/i386/boot2/Makefile#).
Let us look at the rule that creates the [.filename]#b= oot# file: > > [.programlisting] > .... > @@ -570,19 +573,19 @@ Before studying the BTX Server in detail, let us fu= rther review how the single, > cat boot1 boot2 > boot > .... > > -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot]] > +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot]] > This tells us that [.filename]#boot1# and [.filename]#boot2# are needed,= and the rule simply concatenates them to produce a single file called [.fi= lename]#boot#. The rules for creating [.filename]#boot1# are also quite sim= ple: > > [.programlisting] > .... > boot1: boot1.out > - objcopy -S -O binary boot1.out boot1 > + ${OBJCOPY} -S -O binary boot1.out ${.TARGET} > > boot1.out: boot1.o > - ld -e start -Ttext 0x7c00 -o boot1.out boot1.o > + ${LD} ${LD_FLAGS} -e start --defsym ORG=3D${ORG1} -T ${LDSCRIPT} = -o ${.TARGET} boot1.o > .... > > -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot1]] > +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot1]] > To apply the rule for creating [.filename]#boot1#, [.filename]#boot1.out= # must be resolved. This, in turn, depends on the existence of [.filename]#= boot1.o#. This last file is simply the result of assembling our familiar [.= filename]#boot1.S#, without linking. Now, the rule for creating [.filename]= #boot1.out# is applied. This tells us that [.filename]#boot1.o# should be l= inked with `start` as its entry point, and starting at address `0x7c00`. Fi= nally, [.filename]#boot1# is created from [.filename]#boot1.out# applying t= he appropriate rule. This rule is the [.filename]#objcopy# command applied = to [.filename]#boot1.out#. Note the flags passed to [.filename]#objcopy#: `= -S` tells it to strip all relocation and symbolic information; `-O binary` = indicates the output format, that is, a simple, unformatted binary file. 
> > Having [.filename]#boot1#, let us take a look at how [.filename]#boot2# = is constructed: > @@ -590,30 +593,22 @@ Having [.filename]#boot1#, let us take a look at ho= w [.filename]#boot2# is const > [.programlisting] > .... > boot2: boot2.ld > - @set -- `ls -l boot2.ld`; x=3D$$((7680-$$5)); \ > + @set -- `ls -l ${.ALLSRC}`; x=3D$$((${BOOT2SIZE}-$$5)); \ > echo "$$x bytes available"; test $$x -ge 0 > - dd if=3Dboot2.ld of=3Dboot2 obs=3D7680 conv=3Dosync > + ${DD} if=3D${.ALLSRC} of=3D${.TARGET} bs=3D${BOOT2SIZE} conv=3Dsy= nc > > - boot2.ld: boot2.ldr boot2.bin ../btx/btx/btx > - btxld -v -E 0x2000 -f bin -b ../btx/btx/btx -l boot2.ldr \ > - -o boot2.ld -P 1 boot2.bin > + boot2.ld: boot2.ldr boot2.bin ${BTXKERN} > + btxld -v -E ${ORG2} -f bin -b ${BTXKERN} -l boot2.ldr \ > + -o ${.TARGET} -P 1 boot2.bin > > boot2.ldr: > - dd if=3D/dev/zero of=3Dboot2.ldr bs=3D512 count=3D1 > + ${DD} if=3D/dev/zero of=3D${.TARGET} bs=3D512 count=3D1 > > boot2.bin: boot2.out > - objcopy -S -O binary boot2.out boot2.bin > + ${OBJCOPY} -S -O binary boot2.out ${.TARGET} > > - boot2.out: ../btx/lib/crt0.o boot2.o sio.o > - ld -Ttext 0x2000 -o boot2.out > - > - boot2.o: boot2.s > - ${CC} ${ACFLAGS} -c boot2.s > - > - boot2.s: boot2.c boot2.h ${.CURDIR}/../../common/ufsread.c > - ${CC} ${CFLAGS} -S -o boot2.s.tmp ${.CURDIR}/boot2.c > - sed -e '/align/d' -e '/nop/d' "MISSING" boot2.s.tmp > boot2.s > - rm -f boot2.s.tmp > + boot2.out: ${BTXCRT} boot2.o sio.o ashldi3.o > + ${LD} ${LD_FLAGS} --defsym ORG=3D${ORG2} -T ${LDSCRIPT} -o ${.TAR= GET} ${.ALLSRC} > > boot2.h: boot1.out > ${NM} -t d ${.ALLSRC} | awk '/([0-9])+ T xread/ \ > @@ -623,21 +618,19 @@ Having [.filename]#boot1#, let us take a look at ho= w [.filename]#boot2# is const > REL1=3D`printf "%d" ${REL1}` > ${.TARGET} > .... 
> > -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot2]] > +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot2]] > The mechanism for building [.filename]#boot2# is far more elaborate. Let= us point out the most relevant facts. The dependency list is as follows: > > [.programlisting] > .... > boot2: boot2.ld > - boot2.ld: boot2.ldr boot2.bin ${BTXDIR}/btx/btx > + boot2.ld: boot2.ldr boot2.bin ${BTXDIR} > boot2.bin: boot2.out > - boot2.out: ${BTXDIR}/lib/crt0.o boot2.o sio.o > - boot2.o: boot2.s > - boot2.s: boot2.c boot2.h ${.CURDIR}/../../common/ufsread.c > + boot2.out: ${BTXDIR} boot2.o sio.o ashldi3.o > boot2.h: boot1.out > .... > > -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot2-more]= ] > +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot2-more]] > Note that initially there is no header file [.filename]#boot2.h#, but it= s creation depends on [.filename]#boot1.out#, which we already have. The ru= le for its creation is a bit terse, but the important thing is that the out= put, [.filename]#boot2.h#, is something like this: > > [.programlisting] > @@ -645,12 +638,12 @@ Note that initially there is no header file [.filen= ame]#boot2.h#, but its creati > #define XREADORG 0x725 > .... > > -.[.filename]#sys/boot/i386/boot2/boot2.h# [[boot-boot1-make-boot2h]] > +.[.filename]#stand/i386/boot2/boot2.h# [[boot-boot1-make-boot2h]] > Recall that [.filename]#boot1# was relocated (i.e., copied from `0x7c00`= to `0x700`). This relocation will now make sense, because as we will see, = the BTX server reclaims some memory, including the space where [.filename]#= boot1# was originally loaded. However, the BTX server needs access to [.fil= ename]#boot1#'s `xread` function; this function, according to the output of= [.filename]#boot2.h#, is at location `0x725`. Indeed, the BTX server uses = the `xread` function from [.filename]#boot1#'s relocated code. 
This function is now accessible from within the [.filename]#boot2# client.
>
> -We next build [.filename]#boot2.s# from files [.filename]#boot2.h#, [.filename]#boot2.c# and [.filename]#/usr/src/sys/boot/common/ufsread.c#. The rule for this is to compile the code in [.filename]#boot2.c# (which includes [.filename]#boot2.h# and [.filename]#ufsread.c#) into assembly code. Having [.filename]#boot2.s#, the next rule assembles [.filename]#boot2.s#, creating the object file [.filename]#boot2.o#. The next rule directs the linker to link various files ([.filename]#crt0.o#, [.filename]#boot2.o# and [.filename]#sio.o#). Note that the output file, [.filename]#boot2.out#, is linked to execute at address `0x2000`. Recall that [.filename]#boot2# will be executed in user mode, within a special user segment set up by the BTX server. This segment starts at `0xa000`. Also, remember that the [.filename]#boot2# portion of [.filename]#boot# was copied to address `0xc000`, that is, offset `0x2000` from the start of the user segment, so [.filename]#boot2# will work properly when we transfer control to it. Next, [.filename]#boot2.bin# is created from [.filename]#boot2.out# by stripping its symbols and format information; boot2.bin is a _raw_ binary. Now, note that a file [.filename]#boot2.ldr# is created as a 512-byte file full of zeros. This space is reserved for the bsdlabel.
> +The next rule directs the linker to link various files ([.filename]#ashldi3.o#, [.filename]#boot2.o# and [.filename]#sio.o#). Note that the output file, [.filename]#boot2.out#, is linked to execute at address `0x2000` (${ORG2}). Recall that [.filename]#boot2# will be executed in user mode, within a special user segment set up by the BTX server. This segment starts at `0xa000`.
Also, remember that the [.filename]#boot2# portion of [.filename]#b= oot# was copied to address `0xc000`, that is, offset `0x2000` from the star= t of the user segment, so [.filename]#boot2# will work properly when we tra= nsfer control to it. Next, [.filename]#boot2.bin# is created from [.filenam= e]#boot2.out# by stripping its symbols and format information; boot2.bin is= a _raw_ binary. Now, note that a file [.filename]#boot2.ldr# is created as= a 512-byte file full of zeros. This space is reserved for the bsdlabel. > > -Now that we have files [.filename]#boot1#, [.filename]#boot2.bin# and [.= filename]#boot2.ldr#, only the BTX server is missing before creating the al= l-in-one [.filename]#boot# file. The BTX server is located in [.filename]#/= usr/src/sys/boot/i386/btx/btx#; it has its own [.filename]#Makefile# with i= ts own set of rules for building. The important thing to notice is that it = is also compiled as a _raw_ binary, and that it is linked to execute at add= ress `0x9000`. The details can be found in [.filename]#/usr/src/sys/boot/i3= 86/btx/btx/Makefile#. > +Now that we have files [.filename]#boot1#, [.filename]#boot2.bin# and [.= filename]#boot2.ldr#, only the BTX server is missing before creating the al= l-in-one [.filename]#boot# file. The BTX server is located in [.filename]#s= tand/i386/btx/btx#; it has its own [.filename]#Makefile# with its own set o= f rules for building. The important thing to notice is that it is also comp= iled as a _raw_ binary, and that it is linked to execute at address `0x9000= `. The details can be found in [.filename]#stand/i386/btx/btx/Makefile#. > > Having the files that comprise the [.filename]#boot# program, the final = step is to _merge_ them. This is done by a special program called [.filenam= e]#btxld# (source located in [.filename]#/usr/src/usr.sbin/btxld#). Some ar= guments to this program include the name of the output file ([.filename]#bo= ot#), its entry point (`0x2000`) and its file format (raw binary). 
The various files are finally merged by this utility into the file [.filename]#boot#, which consists of [.filename]#boot1#, [.filename]#boot2#, the `bsdlabel` and the BTX server. This file, which takes exactly 16 sectors, or 8192 bytes, is what is actually written to the beginning of the FreeBSD slice during installation. Let us now proceed to study the BTX server program.
>
> @@ -680,7 +673,7 @@ btx_hdr: .byte 0xeb # Machine ID
>   .long 0x0 # Entry address
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-header]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-header]]
>  Note the first two bytes are `0xeb` and `0xe`. In the IA-32 architecture, these two bytes are interpreted as a relative jump past the header into the entry point, so in theory, [.filename]#boot1# could jump here (address `0x9000`) instead of address `0x9010`. Note that the last field in the BTX header is a pointer to the client's ([.filename]#boot2#) entry point. This field is patched at link time.
>
>  Immediately following the header is the BTX server's entry point:
> @@ -693,14 +686,14 @@ Immediately following the header is the BTX server's entry point:
>  init: cli # Disable interrupts
>   xor %ax,%ax # Zero/segment
>   mov %ax,%ss # Set up
> - mov $0x1800,%sp # stack
> + mov $MEM_ESP0,%sp # stack
>   mov %ax,%es # Address
>   mov %ax,%ds # data
>   pushl $0x2 # Clear
>   popfl # flags
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-init]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-init]]
>  This code disables interrupts, sets up a working stack (starting at address `0x1800`) and clears the flags in the EFLAGS register. Note that the `popfl` instruction pops out a doubleword (4 bytes) from the stack and places it in the EFLAGS register. As the value actually popped is `2`, the EFLAGS register is effectively cleared (IA-32 requires that bit 1 of the EFLAGS register, the bit with value `2`, always be 1).
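To make the remark about the popped value `2` concrete: on IA-32 the EFLAGS bit with value 2 (bit 1 when counting from zero) is reserved and always reads as 1, so `0x2` is the "all flags clear" pattern. The sketch below uses macro and function names invented for this illustration; they are not taken from btx.S.

```c
#include <assert.h>
#include <stdint.h>

/* Names are illustrative; bit positions follow the IA-32 EFLAGS layout. */
#define EFLAGS_RESERVED1 (1u << 1)  /* reserved, always 1 */
#define EFLAGS_IF        (1u << 9)  /* interrupt enable flag */

/* The value to load when every real flag should be clear. */
static uint32_t eflags_cleared(void)
{
    uint32_t value = EFLAGS_RESERVED1;
    assert(value == 0x2u);
    return value;
}

/* Non-zero when maskable interrupts are enabled in the given EFLAGS image. */
static int eflags_interrupts_enabled(uint32_t eflags)
{
    return (eflags & EFLAGS_IF) != 0;
}
```

Under the same layout, an image such as `0x202` is the cleared pattern with only the interrupt enable flag added.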
>
>  Our next code block clears (sets to `0`) the memory range `0x5e00-0x8fff`. This range is where the various data structures will be created:
> @@ -710,13 +703,13 @@ Our next code block clears (sets to `0`) the memory range `0x5e00-0x8fff`. This
>  /*
>   * Initialize memory.
>   */
> - mov $0x5e00,%di # Memory to initialize
> - mov $(0x9000-0x5e00)/2,%cx # Words to zero
> + mov $MEM_IDT,%di # Memory to initialize
> + mov $(MEM_ORG-MEM_IDT)/2,%cx # Words to zero
>   rep # Zero-fill
>   stosw # memory
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-clear-mem]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-clear-mem]]
>  Recall that [.filename]#boot1# was originally loaded to address `0x7c00`, so, with this memory initialization, that copy effectively disappeared. However, also recall that [.filename]#boot1# was relocated to `0x700`, so _that_ copy is still in memory, and the BTX server will make use of it.
>
>  Next, the real-mode IVT (Interrupt Vector Table) is updated. The IVT is an array of segment/offset pairs for exception and interrupt handlers. The BIOS normally maps hardware interrupts to interrupt vectors `0x8` to `0xf` and `0x70` to `0x77` but, as will be seen, the 8259A Programmable Interrupt Controller, the chip controlling the actual mapping of hardware interrupts to interrupt vectors, is programmed to remap these interrupt vectors from `0x8-0xf` to `0x20-0x27` and from `0x70-0x77` to `0x28-0x2f`. Thus, interrupt handlers are provided for interrupt vectors `0x20-0x2f`. The reason the BIOS-provided handlers are not used directly is that they work in 16-bit real mode, but not 32-bit protected mode. Processor mode will be switched to 32-bit protected mode shortly. However, the BTX server sets up a mechanism to effectively use the handlers provided by the BIOS:
> @@ -737,7 +730,7 @@ init.0: mov %bx,(%di) # Store IP
>   loop init.0 # Next IRQ
>  ....
> > -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-ivt]] > +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-ivt]] > The next block creates the IDT (Interrupt Descriptor Table). The IDT is = analogous, in protected mode, to the IVT in real mode. That is, the IDT des= cribes the various exception and interrupt handlers used when the processor= is executing in protected mode. In essence, it also consists of an array o= f segment/offset pairs, although the structure is somewhat more complex, be= cause segments in protected mode are different than in real mode, and vario= us protection mechanisms apply: > > [.programlisting] > @@ -745,7 +738,7 @@ The next block creates the IDT (Interrupt Descriptor = Table). The IDT is analogou > /* > * Create IDT. > */ > - mov $0x5e00,%di # IDT's address > + mov $MEM_IDT,%di # IDT's address > mov $idtctl,%si # Control string > init.1: lodsb # Get entry > cbw # count > @@ -768,7 +761,7 @@ init.3: lea 0x8(%di),%di #= Next entry > jmp init.1 # Continue > .... > > -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-idt]] > +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-idt]] > Each entry in the `IDT` is 8 bytes long. Besides the segment/offset info= rmation, they also describe the segment type, privilege level, and whether = the segment is present in memory or not. The construction is such that inte= rrupt vectors from `0` to `0xf` (exceptions) are handled by function `intx0= 0`; vector `0x10` (also an exception) is handled by `intx10`; hardware inte= rrupts, which are later configured to start at interrupt vector `0x20` all = the way to interrupt vector `0x2f`, are handled by function `intx20`. Lastl= y, interrupt vector `0x30`, which is used for system calls, is handled by `= intx30`, and vectors `0x31` and `0x32` are handled by `intx31`. 
It must be noted that only descriptors for interrupt vectors `0x30`, `0x31` and `0x32` are given privilege level 3, the same privilege level as the [.filename]#boot2# client, which means the client can execute a software-generated interrupt to these vectors through the `int` instruction without failing (this is the way [.filename]#boot2# uses the services provided by the BTX server). Also, note that _only_ software-generated interrupts are protected from code executing in lesser privilege levels. Hardware-generated interrupts and processor-generated exceptions are _always_ handled adequately, regardless of the actual privileges involved.
>
>  The next step is to initialize the TSS (Task-State Segment). The TSS is a hardware feature that helps the operating system or executive software implement multitasking functionality through process abstraction. The IA-32 architecture demands the creation and use of _at least_ one TSS if multitasking facilities are used or different privilege levels are defined. Since the [.filename]#boot2# client is executed in privilege level 3, but the BTX server runs in privilege level 0, a TSS must be defined:
> @@ -783,7 +776,7 @@ init.4: movb $_ESP0H,TSS_ESP0+1(%di) # Set ESP0
>   movb $_TSSIO,TSS_MAP(%di) # Set I/O bit map base
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-tss]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-tss]]
>  Note that a value is given for the Privilege Level 0 stack pointer and stack segment in the TSS. This is needed because, if an interrupt or exception is received while executing [.filename]#boot2# in Privilege Level 3, a change to Privilege Level 0 is automatically performed by the processor, so a new working stack is needed. Finally, the I/O Map Base Address field of the TSS is given a value, which is a 16-bit offset from the beginning of the TSS to the I/O Permission Bitmap and the Interrupt Redirection Bitmap.
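The 8-byte IDT entries and the privilege rule for vectors `0x30` to `0x32` discussed in the quoted text can be pictured with a small C model of a 32-bit interrupt gate. This is a sketch only: the struct, the macros and the helper names are made up for illustration, and the code selector passed by a caller is an arbitrary value, not the one BTX uses.

```c
#include <assert.h>
#include <stdint.h>

/* One IA-32 interrupt gate: 8 bytes, handler offset split in two halves. */
struct idt_gate {
    uint16_t offset_low;   /* handler offset, bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* P | DPL | gate type */
    uint16_t offset_high;  /* handler offset, bits 16..31 */
};

#define GATE_PRESENT 0x80u  /* P bit */
#define GATE_INT32   0x0eu  /* 32-bit interrupt gate type */

/* Build a present 32-bit interrupt gate with the given privilege level. */
static struct idt_gate make_gate(uint32_t handler, uint16_t selector,
                                 unsigned dpl)
{
    struct idt_gate gate;

    assert(dpl <= 3);
    gate.offset_low = (uint16_t)(handler & 0xffffu);
    gate.selector = selector;
    gate.reserved = 0;
    gate.type_attr = (uint8_t)(GATE_PRESENT | (dpl << 5) | GATE_INT32);
    gate.offset_high = (uint16_t)(handler >> 16);
    return gate;
}

/* DPL described in the text: 3 for vectors 0x30..0x32, 0 for the rest. */
static unsigned btx_vector_dpl(unsigned vector)
{
    return (vector >= 0x30 && vector <= 0x32) ? 3u : 0u;
}
```

With this model, an `int` instruction issued from privilege level 3 is only accepted for gates whose DPL field is 3, which is why the client can reach vectors `0x30` to `0x32` but not the others.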
>
>  After the IDT and TSS are created, the processor is ready to switch to protected mode. This is done in the next block:
> @@ -807,7 +800,7 @@ init.8: xorl %ecx,%ecx # Zero
>   movw %cx,%ss # stack
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-prot]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-prot]]
>  First, a call is made to `setpic` to program the 8259A PIC (Programmable Interrupt Controller). This chip is connected to multiple hardware interrupt sources. Upon receiving an interrupt from a device, it signals the processor with the appropriate interrupt vector. This can be customized so that specific interrupts are associated with specific interrupt vectors, as explained before. Next, the IDTR (Interrupt Descriptor Table Register) and GDTR (Global Descriptor Table Register) are loaded with the instructions `lidt` and `lgdt`, respectively. These registers are loaded with the base address and limit address for the IDT and GDT. The following three instructions set the Protection Enable (PE) bit of the `%cr0` register. This effectively switches the processor to 32-bit protected mode. Next, a long jump is made to `init.8` using segment selector SEL_SCODE, which selects the Supervisor Code Segment. The processor is effectively executing in CPL 0, the most privileged level, after this jump. Finally, the Supervisor Data Segment is selected for the stack by assigning the segment selector SEL_SDATA to the `%ss` register. This data segment also has a privilege level of `0`.
>
>  Our last code block is responsible for loading the TR (Task Register) with the segment selector for the TSS we created earlier, and setting the User Mode environment before passing execution control to the [.filename]#boot2# client.
> @@ -819,7 +812,7 @@ Our last code block is responsible for loading the TR (Task Register) with the s
>   */
>   movb $SEL_TSS,%cl # Set task
>   ltr %cx # register
> - movl $0xa000,%edx # User base address
> + movl $MEM_USR,%edx # User base address
>   movzwl %ss:BDA_MEM,%eax # Get free memory
>   shll $0xa,%eax # To bytes
>   subl $ARGSPACE,%eax # Less arg space
> @@ -838,6 +831,9 @@ Our last code block is responsible for loading the TR (Task Register) with the s
>   movb $0x7,%cl # Set remaining
>  init.9: push $0x0 # general
>   loop init.9 # registers
> +#ifdef BTX_SERIAL
> + call sio_init # setup the serial console
> +#endif
>   popa # and initialize
>   popl %es # Initialize
>   popl %ds # user
> @@ -846,7 +842,7 @@ init.9: push $0x0 # general
>   iret # To user mode
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-end]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-end]]
>  Note that the client's environment includes a stack segment selector and stack pointer (registers `%ss` and `%esp`). Indeed, once the TR is loaded with the segment selector for the TSS (instruction `ltr`), the stack pointer is calculated and pushed onto the stack along with the stack's segment selector. Next, the value `0x202` is pushed onto the stack; it is the value that the EFLAGS will get when control is passed to the client. Also, the User Mode code segment selector and the client's entry point are pushed. Recall that this entry point is patched in the BTX header at link time. Finally, segment selectors (stored in register `%ecx`) for the segment registers `%gs, %fs, %ds and %es` are pushed onto the stack, along with the value at `%edx` (`0xa000`). Keep in mind the various values that have been pushed onto the stack (they will be popped out shortly). Next, values for the remaining general purpose registers are also pushed onto the stack (note the `loop` that pushes the value `0` seven times). Now the values start to be popped off the stack.
First, the `popa` instruction pops out of the stack the latest seven values pushed. They are stored in the general purpose registers in order `%edi, %esi, %ebp, %ebx, %edx, %ecx, %eax`. Then, the various segment selectors pushed are popped into the various segment registers. Five values still remain on the stack. They are popped when the `iret` instruction is executed. This instruction first pops the value that was pushed from the BTX header. This value is a pointer to [.filename]#boot2#'s entry point. It is placed in the register `%eip`, the instruction pointer register. Next, the segment selector for the User Code Segment is popped and copied to register `%cs`. Remember that this segment's privilege level is 3, the least privileged level. This means that we must provide values for the stack of this privilege level. This is why the processor, besides further popping the value for the EFLAGS register, does two more pops out of the stack. These values go to the stack pointer (`%esp`) and the stack segment (`%ss`). Now, execution continues at [.filename]#boot2#'s entry point.
>
>  It is important to note how the User Code Segment is defined. This segment's _base address_ is set to `0xa000`. This means that code memory addresses are _relative_ to address 0xa000; if code being executed is fetched from address `0x2000`, the _actual_ memory addressed is `0xa000+0x2000=0xc000`.
> @@ -886,9 +882,9 @@ struct bootinfo {
>
>  [.programlisting]
>  ....
> -sys/boot/i386/boot2/boot2.c:
> +stand/i386/boot2/boot2.c:
>   __exec((caddr_t)addr, RB_BOOTINFO | (opts & RBX_MASK),
> - MAKEBOOTDEV(dev_maj[dsk.type], 0, dsk.slice, dsk.unit, dsk.part),
> + MAKEBOOTDEV(dev_maj[dsk.type], dsk.slice, dsk.unit, dsk.part),
>   0, 0, 0, VTOP(&bootinfo));
>  ....
>
> @@ -901,21 +897,21 @@ The main task for the loader is to boot the kernel. When the kernel is loaded in
>
>  [.programlisting]
>  ....
> -sys/boot/common/boot.c:
> +stand/common/boot.c:
> /* Call the exec handler from the loader matching the kernel */
> - module_formats[km->m_loader]->l_exec(km);
> + file_formats[fp->f_loader]->l_exec(fp);
> ....
>
> [[boot-kernel]]
> == Kernel Initialization
>
> -Let us take a look at the command that links the kernel. This will help identify the exact location where the loader passes execution to the kernel. This location is the kernel's actual entry point.
> +Let us take a look at the command that links the kernel. This will help identify the exact location where the loader passes execution to the kernel. This location is the kernel's actual entry point. This command is now excluded from [.filename]#sys/conf/Makefile.i386#. The content that interests us can be found in [.filename]#/usr/obj/usr/src/i386.i386/sys/GENERIC/#.
>
> [.programlisting]
> ....
> -sys/conf/Makefile.i386:
> -ld -elf -Bdynamic -T /usr/src/sys/conf/ldscript.i386 -export-dynamic \
> --dynamic-linker /red/herring -o kernel -X locore.o \
> +/usr/obj/usr/src/i386.i386/sys/GENERIC/kernel.meta:
> +ld -m elf_i386_fbsd -Bdynamic -T /usr/src/sys/conf/ldscript.i386 --build-id=sha1 --no-warn-mismatch \
> +--warn-common --export-dynamic --dynamic-linker /red/herring -X -o kernel locore.o
>
> ....
>
> @@ -959,7 +955,7 @@ sys/i386/i386/locore.s:
> mov %ax, %gs
> ....
>
> -btext calls the routines `recover_bootinfo()`, `identify_cpu()`, `create_pagetables()`, which are also defined in [.filename]#locore.s#. Here is a description of what they do:
> +btext calls the routines `recover_bootinfo()`, `identify_cpu()`, which are also defined in [.filename]#locore.s#. Here is a description of what they do:
>
> [.informaltable]
> [cols="1,1", frame="none"]
> @@ -969,29 +965,27 @@ btext calls the routines `recover_bootinfo()`, `identify_cpu()`, `create_pagetab
> |This routine parses the parameters to the kernel passed from the bootstrap. The kernel may have been booted in 3 ways: by the loader, described above, by the old disk boot blocks, or by the old diskless boot procedure. This function determines the booting method, and stores the `struct bootinfo` structure into the kernel memory.
>
> |`identify_cpu`
> -|This functions tries to find out what CPU it is running on, storing the value found in a variable `_cpu`.
> -
> -|`create_pagetables`
> -|This function allocates and fills out a Page Table Directory at the top of the kernel memory area.
> +|This function tries to find out what CPU it is running on, storing the value found in a variable `_cpu`.
> |===
>
> The next steps are enabling VME, if the CPU supports it:
>
> [.programlisting]
> ....
> - testl $CPUID_VME, R(_cpu_feature)
> - jz 1f
> - movl %cr4, %eax
> - orl $CR4_VME, %eax
> - movl %eax, %cr4
> +sys/i386/i386/mpboot.s:
> + testl $CPUID_VME,%edx
> + jz 3f
> + orl $CR4_VME,%eax
> +3: movl %eax,%cr4
> ....
>
> Then, enabling paging:
>
> [.programlisting]
> ....
> +sys/i386/i386/mpboot.s:
> /* Now enable paging */
> - movl R(_IdlePTD), %eax
> + movl IdlePTD_nopae, %eax
> movl %eax,%cr3 /* load ptd addr into mmu */
> movl %cr0,%eax /* get control word */
> orl $CR0_PE|CR0_PG,%eax /* enable paging */
> @@ -1002,11 +996,12 @@ The next three lines of code are because the paging was set, so the jump is need
>
> [.programlisting]
> ....
> - pushl $begin /* jump to high virtualized address */
> +sys/i386/i386/mpboot.s:
> + pushl $mp_begin /* jump to high mem */
> ret
>
> /* now running relocated at KERNBASE where the system is linked to run */
> -begin:
> +mp_begin: /* now running relocated at KERNBASE */
> ....
>
> The function `init386()` is called with a pointer to the first free physical page, after that `mi_startup()`. `init386` is an architecture dependent initialization function, and `mi_startup()` is an architecture independent one (the 'mi_' prefix stands for Machine Independent). The kernel never returns from `mi_startup()`, and by calling it, the kernel finishes booting:
> @@ -1014,11 +1009,12 @@ The function `init386()` is called with a pointer to the first free physical pag
> [.programlisting]
> ....
> sys/i386/i386/locore.s:
> - movl physfree, %esi
> - pushl %esi /* value of first for init386(first) */
> - call _init386 /* wire 386 chip for unix operation */
> - call _mi_startup /* autoconfiguration, mountroot etc */
> - hlt /* never returns to here */
> + pushl physfree /* value of first for init386(first) */
> + call init386 /* wire 386 chip for unix operation */
> + addl $4,%esp
> + movl %eax,%esp /* Switch to true top of stack. */
> + call mi_startup /* autoconfiguration, mountroot etc */
> + /* NOTREACHED */
> ....
>
> === `init386()`
> @@ -1032,15 +1028,13 @@ sys/i386/i386/locore.s:
> * Initialize the DDB, if it is compiled into kernel.
> * Initialize the TSS.
> * Prepare the LDT.
> -* Set up proc0's pcb.
> +* Set up thread0's pcb.
>
> `init386()` initializes the tunable parameters passed from bootstrap by setting the environment pointer (envp) and calling `init_param1()`. The envp pointer has been passed from loader in the `bootinfo` structure:
>
> [.programlisting]
> ....
> sys/i386/i386/machdep.c:
> - kern_envp = (caddr_t)bootinfo.bi_envp + KERNBASE;
> -
> /* Init basic tunables, hz etc */
> init_param1();
> ....
> @@ -1050,8 +1044,10 @@ sys/i386/i386/machdep.c:
> [.programlisting]
> ....
> sys/kern/subr_param.c:
> - hz = HZ;
> + hz = -1;
> TUNABLE_INT_FETCH("kern.hz", &hz);
> + if (hz == -1)
> + hz = vm_guest > VM_GUEST_NO ? HZ_VM : HZ;
> ....
>
> TUNABLE_<typename>_FETCH is used to fetch the value from the environment:
> @@ -1069,30 +1065,36 @@ Then `init386()` prepares the Global Descriptors Table (GDT). Every task on an x
> [.programlisting]
> ....
> sys/i386/i386/machdep.c:
> -union descriptor gdt[NGDT * MAXCPU]; /* global descriptor table */
> +union descriptor gdt0[NGDT]; /* initial global descriptor table */
> +union descriptor *gdt = gdt0; /* global descriptor table */
>
> -sys/i386/include/segments.h:
> +sys/x86/include/segments.h:
> /*
> * Entries in the Global Descriptor Table (GDT)
> */
> #define GNULL_SEL 0 /* Null Descriptor */
> -#define GCODE_SEL 1 /* Kernel Code Descriptor */
> -#define GDATA_SEL 2 /* Kernel Data Descriptor */
> -#define GPRIV_SEL 3 /* SMP Per-Processor Private Data */
> -#define GPROC0_SEL 4 /* Task state process slot zero and up */
> -#define GLDT_SEL 5 /* LDT - eventually one per process */
> -#define GUSERLDT_SEL 6 /* User LDT */
> -#define GTGATE_SEL 7 /* Process task switch gate */
> +#define GPRIV_SEL 1 /* SMP Per-Processor Private Data */
> +#define GUFS_SEL 2 /* User %fs Descriptor (order critical: 1) */
> +#define GUGS_SEL 3 /* User %gs Descriptor (order critical: 2) */
> +#define GCODE_SEL 4 /* Kernel Code Descriptor (order critical: 1) */
> +#define GDATA_SEL 5 /* Kernel Data Descriptor (order critical: 2) */
> +#define GUCODE_SEL 6 /* User Code Descriptor (order critical: 3) */
> +#define GUDATA_SEL 7 /* User Data Descriptor (order critical: 4) */
> #define GBIOSLOWMEM_SEL 8 /* BIOS low memory access (must be entry 8) */
> -#define GPANIC_SEL 9 /* Task state to consider panic from */
> -#define GBIOSCODE32_SEL 10 /* BIOS interface (32bit Code) */
> -#define GBIOSCODE16_SEL 11 /* BIOS interface (16bit Code) */
> -#define GBIOSDATA_SEL 12 /* BIOS interface (Data) */
> -#define GBIOSUTIL_SEL 13 /* BIOS interface (Utility) */
> -#define GBIOSARGS_SEL 14 /* BIOS interface (Arguments) */
> +#define GPROC0_SEL 9 /* Task state process slot zero and up */
> +#define GLDT_SEL 10 /* Default User LDT */
> +#define GUSERLDT_SEL 11 /* User LDT */
> +#define GPANIC_SEL 12 /* Task state to consider panic from */
> +#define GBIOSCODE32_SEL 13 /* BIOS interface (32bit Code) */
> +#define GBIOSCODE16_SEL 14 /* BIOS interface (16bit Code) */
> +#define GBIOSDATA_SEL 15 /* BIOS interface (Data) */
> +#define GBIOSUTIL_SEL 16 /* BIOS interface (Utility) */
> +#define GBIOSARGS_SEL 17 /* BIOS interface (Arguments) */
> +#define GNDIS_SEL 18 /* For the NDIS layer */
> +#define NGDT 19
> ....
>
> -Note that those #defines are not selectors themselves, but just a field INDEX of a selector, so they are exactly the indices of the GDT. for example, an actual selector for the kernel code (GCODE_SEL) has the value 0x08.
> +Note that those #defines are not selectors themselves, but just a field INDEX of a selector, so they are exactly the indices of the GDT. for example, an actual selector for the kernel code (GCODE_SEL) has the value 0x20.
>
> The next step is to initialize the Interrupt Descriptor Table (IDT). This table is referenced by the processor when a software or hardware interrupt occurs. For example, to make a system call, user application issues the `INT 0x80` instruction. This is a software interrupt, so the processor's hardware looks up a record with index 0x80 in the IDT. This record points to the routine that handles this interrupt, in this particular case, this will be the kernel's syscall gate. The IDT may have a maximum of 256 (0x100) records. The kernel allocates NIDT records for the IDT, where NIDT is the maximum (256):
>
> @@ -1108,8 +1110,8 @@ For each interrupt, an appropriate handler is set. The syscall gate for `INT 0x8
> [.programlisting]
> ....
> sys/i386/i386/machdep.c:
> - setidt(0x80, &IDTVEC(int0x80_syscall),
> - SDT_SYS386TGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL));
> + setidt(IDT_SYSCALL, &IDTVEC(int0x80_syscall),
> + SDT_SYS386IGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL));
> ....
>
> So when a userland application issues the `INT 0x80` instruction, control will transfer to the function `_Xint0x80_syscall`, which is in the kernel code segment and will be executed with supervisor privileges.
> @@ -1121,10 +1123,10 @@ Console and DDB are then initialized:
> sys/i386/i386/machdep.c:
> cninit();
> /* skipped */
> -#ifdef DDB
> - kdb_init();
> + kdb_init();
> +#ifdef KDB
> if (boothowto & RB_KDB)
> - Debugger("Boot flags requested debugger");
> + kdb_enter(KDB_WHY_BOOTFLAGS, "Boot flags requested debugger");
> #endif
> ....
>
> @@ -1134,25 +1136,27 @@ The Local Descriptors Table is used to reference userland code and data. Several
>
> [.programlisting]
> ....
> -/usr/include/machine/segments.h:
> +sys/x86/include/segments.h:
> #define LSYS5CALLS_SEL 0 /* forced by intel BCS */
> #define LSYS5SIGR_SEL 1
> -#define L43BSDCALLS_SEL 2 /* notyet */
> #define LUCODE_SEL 3
> -#define LSOL26CALLS_SEL 4 /* Solaris >= 2.6 system call gate */
> #define LUDATA_SEL 5
> -/* separate stack, es,fs,gs sels ? */
> -/* #define LPOSIXCALLS_SEL 5*/ /* notyet */
> -#define LBSDICALLS_SEL 16 /* BSDI system call gate */
> -#define NLDT (LBSDICALLS_SEL + 1)
> +#define NLDT (LUDATA_SEL + 1)
> ....
>
> -Next, proc0's Process Control Block (`struct pcb`) structure is initialized. proc0 is a `struct proc` structure that describes a kernel process. It is always present while the kernel is running, therefore it is declared as global:
> +Next, proc0's Process Control Block (`struct pcb`) structure is initialized. proc0 is a `struct proc` structure that describes a kernel process. It is always present while the kernel is running, therefore it is linked with thread0:
>
> [.programlisting]
> ....
> -sys/kern/kern_init.c:
> - struct proc proc0;
> +sys/i386/i386/machdep.c:
> +register_t
> +init386(int first)
> +{
> + /* ... skipped ... */
> +
> + proc_linkup0(&proc0, &thread0);
> + /* ... skipped ... */
> +}
> ....
>
> The structure `struct pcb` is a part of a proc structure. It is defined in [.filename]#/usr/include/machine/pcb.h# and has a process's information specific to the i386 architecture, such as registers values.
> @@ -1164,7 +1168,7 @@ This function performs a bubble sort of all the system initialization objects an
> [.programlisting]
> ....
> sys/kern/init_main.c:
> - for (sipp = sysinit; *sipp; sipp++) {
> + for (sipp = sysinit; sipp < sysinit_end; sipp++) {
>
> /* ... skipped ... */
>
> @@ -1186,10 +1190,11 @@ print_caddr_t(void *data __unused)
> {
> printf("%s", (char *)data);
> }
> -SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright)
> +/* ... skipped ... */
> +SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright);
> ....
>
> -The subsystem ID for this object is SI_SUB_COPYRIGHT (0x0800001), which comes right after the SI_SUB_CONSOLE (0x0800000). So, the copyright message will be printed out first, just after the console initialization.
> +The subsystem ID for this object is SI_SUB_COPYRIGHT (0x0800001). So, the copyright message will be printed out first, just after the console initialization.
>
> Let us take a look at what exactly the macro `SYSINIT()` does. It expands to a `C_SYSINIT()` macro. The `C_SYSINIT()` macro then expands to a static `struct sysinit` structure declaration with another `DATA_SET` macro call:
>
> @@ -1198,91 +1203,62 @@ Let us take a look at what exactly the macro `SYSINIT()` does. It expands to a `
> /usr/include/sys/kernel.h:
> #define C_SYSINIT(uniquifier, subsystem, order, func, ident) \
> static struct sysinit uniquifier ## _sys_init = { \ subsystem, \
> - order, \ func, \ ident \ }; \ DATA_SET(sysinit_set,uniquifier ##
> + order, \ func, \ (ident) \ }; \ DATA_WSET(sysinit_set,uniquifier ##
> _sys_init);
>
> #define SYSINIT(uniquifier, subsystem, order, func, ident) \
> C_SYSINIT(uniquifier, subsystem, order, \
> - (sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)ident)
> + (sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)(ident))
> ....
>
> -The `DATA_SET()` macro expands to a `MAKE_SET()`, and that macro is the point where all the sysinit magic is hidden:
> +The `DATA_SET()` macro expands to a `_MAKE_SET()`, and that macro is the point where all the sysinit magic is hidden:
>
> [.programlisting]
> ....
> /usr/include/linker_set.h:
> -#define MAKE_SET(set, sym) \
> - static void const * const __set_##set##_sym_##sym = sym; \
> - __asm(".section .set." #set ",\"aw\""); \
> - __asm(".long " #sym); \
> - __asm(".previous")
> -#endif
> -#define TEXT_SET(set, sym) MAKE_SET(set, sym)
> -#define DATA_SET(set, sym) MAKE_SET(set, sym)
> +#define TEXT_SET(set, sym) _MAKE_SET(set, sym)
> +#define DATA_SET(set, sym) _MAKE_SET(set, sym)
> ....
>
> -In our case, the following declaration will occur:
> -
> -[.programlisting]
> -....
> -static struct sysinit announce_sys_init = {
> - SI_SUB_COPYRIGHT,
> - SI_ORDER_FIRST,
> - (sysinit_cfunc_t)(sysinit_nfunc_t) print_caddr_t,
> - (void *) copyright
> -};
> -
> -static void const *const __set_sysinit_set_sym_announce_sys_init =
> - announce_sys_init;
> -__asm(".section .set.sysinit_set" ",\"aw\"");
> -__asm(".long " "announce_sys_init");
> -__asm(".previous");
> -....
> -
> -The first `__asm` instruction will create an ELF section within the kernel's executable. This will happen at kernel link time. The section will have the name `.set.sysinit_set`. The content of this section is one 32-bit value, the address of announce_sys_init structure, and that is what the second `__asm` is. The third `__asm` instruction marks the end of a section. If a directive with the same section name occurred before, the content, i.e., the 32-bit value, will be appended to the existing section, so forming an array of 32-bit pointers.
> -
> +After executing these macros, various sections were made in the kernel, including`set.sysinit_set`.
> Running objdump on a kernel binary, you may notice the presence of such small sections:
>
> [source,bash]
> ....
> -% objdump -h /kernel
> - 7 .set.cons_set 00000014 c03164c0 c03164c0 002154c0 2**2
> - CONTENTS, ALLOC, LOAD, DATA
> - 8 .set.kbddriver_set 00000010 c03164d4 c03164d4 002154d4 2**2
> - CONTENTS, ALLOC, LOAD, DATA
> - 9 .set.scrndr_set 00000024 c03164e4 c03164e4 002154e4 2**2
> - CONTENTS, ALLOC, LOAD, DATA
> - 10 .set.scterm_set 0000000c c0316508 c0316508 00215508 2**2
> - CONTENTS, ALLOC, LOAD, DATA
> - 11 .set.sysctl_set 0000097c c0316514 c0316514 00215514 2**2
> - CONTENTS, ALLOC, LOAD, DATA
> - 12 .set.sysinit_set 00000664 c0316e90 c0316e90 00215e90 2**2
> - CONTENTS, ALLOC, LOAD, DATA
> +% llvm-objdump -h /kernel
> +Sections:
> +Idx Name Size VMA Type
> *** 126 LINES SKIPPED ***

Thanks for this upgrade!!!