debugging frequent kernel panics on 8.2-RELEASE

Kostik Belousov kostikbel at gmail.com
Wed Aug 17 21:29:51 UTC 2011


On Wed, Aug 17, 2011 at 11:21:42PM +0300, Andriy Gapon wrote:
[skip]

> But I also would like to use this opportunity to discuss how we can
> make it easier to debug such issue as this. I think that this problem
> demonstrates that when we treat certain junk in kernel address value
> as a userland address value, we throw additional heaps of irrelevant
> stuff on top of an actual problem. One solution could be to use a
> special flag that would mark all actual attempts to access userland
> address (e.g. setting the flag on entrance to copyin and clearing it
> upon return), so that in the page fault handler we could distinguish
> actual faults on userland addresses from faults on garbage kernel
> addresses. I am sure that there could be other clever techniques to
> catch such garbage addresses early.

We already have such mechanism, the kernel code aware of the usermode
page access sets pcb_onfault. See the end of trap_pfault() handler.
In fact, we can catch it earlier, before even calling vm_fault().

BTW, I think this is esp. useful in the combination with the support
for the SMEP in recent Intel CPUs.

commit 2e1b36fa93f9499e37acf04a66ff0646d4f13536
Author: Konstantin Belousov <kostik at pooma.home>
Date:   Thu Aug 18 00:08:50 2011 +0300

    Assert that the exiting process does not return to usermode.
    On x86, do not call vm_fault() when the kernel is not prepared
    to handle unsuccessful page fault.

diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
index 4e5f8b8..55e1e5a 100644
--- a/sys/amd64/amd64/trap.c
+++ b/sys/amd64/amd64/trap.c
@@ -674,6 +674,19 @@ trap_pfault(frame, usermode)
 			goto nogo;
 
 		map = &vm->vm_map;
+
+		/*
+		 * When accessing a usermode address, kernel must be
+		 * ready to accept the page fault, and provide a
+		 * handling routine.  Since accessing the address
+		 * without the handler is a bug, do not try to handle
+		 * it normally, and panic immediately.
+		 */
+		if (!usermode && (td->td_intr_nesting_level != 0 ||
+		    PCPU_GET(curpcb)->pcb_onfault == NULL)) {
+			trap_fatal(frame, eva);
+			return (-1);
+		}
 	}
 
 	/*
diff --git a/sys/i386/i386/trap.c b/sys/i386/i386/trap.c
index 5a8016c..e6d2b5a 100644
--- a/sys/i386/i386/trap.c
+++ b/sys/i386/i386/trap.c
@@ -831,6 +831,11 @@ trap_pfault(frame, usermode, eva)
 			goto nogo;
 
 		map = &vm->vm_map;
+		if (!usermode && (td->td_intr_nesting_level != 0 ||
+		    PCPU_GET(curpcb)->pcb_onfault == NULL)) {
+			trap_fatal(frame, eva);
+			return (-1);
+		}
 	}
 
 	/*
diff --git a/sys/kern/subr_trap.c b/sys/kern/subr_trap.c
index 3527ed1..a69b7b8 100644
--- a/sys/kern/subr_trap.c
+++ b/sys/kern/subr_trap.c
@@ -99,6 +99,8 @@ userret(struct thread *td, struct trapframe *frame)
 
 	CTR3(KTR_SYSC, "userret: thread %p (pid %d, %s)", td, p->p_pid,
             td->td_name);
+	KASSERT((p->p_flag & P_WEXIT) == 0,
+	    ("Exiting process returns to usermode"));
 #if 0
 #ifdef DIAGNOSTIC
 	/* Check that we called signotify() enough. */
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-jail/attachments/20110817/c7c8e305/attachment-0001.pgp


More information about the freebsd-jail mailing list