ULE locking mechanism

Jens Krieg jkrieg at mailbox.tu-berlin.de
Tue Jan 28 13:27:30 UTC 2014


Hello,

we are currently working on a project for our university. Our goal is to implement a simple round-robin scheduler for FreeBSD 9.2 on a single-core machine.
So far we have removed most of the functionality of the ULE scheduler, except for the functions that are called from outside. The system successfully boots to userland with our RR scheduler managing threads in a list-based run queue. It is also possible to interact with the system using the shell.

The next step is to replace the locking mechanism of the ULE scheduler. To that end, we replaced the scheduler-dependent thread_lock/thread_unlock functions with code that simply disables/enables interrupts. With this modification the kernel works fine until we reach userland, at which point the system crashes.
The error occurs in the init user process (init_main.c:start_init:685). We found that the page fault is triggered when the subyte function is executed for the first time. See the error description below (unfortunately not shown in the backtrace).
We compared the ULE scheduler with our RR implementation, and it appears that the parameters passed to subyte, as well as the register values, are identical. We therefore assume that whatever caused the error is related to the thread locking replacement.
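
To make the replacement described above concrete, here is a minimal userland sketch in C. It is only a simulation under stated assumptions: the interrupt flag is modelled by a plain variable, and every name in it (sim_intr_disable, sim_intr_restore, rr_thread_lock, rr_thread_unlock, struct rr_thread) is hypothetical and not taken from the FreeBSD source or from the actual patch. The sketch also assumes that the previous interrupt state is saved and restored, so that nested lock/unlock pairs do not re-enable interrupts early; the message does not say whether the real replacement does this.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt-enable flag; stands in for the CPU flag register. */
static bool sim_intr_enabled = true;

/* Hypothetical primitive: disable interrupts and return the previous state. */
static bool sim_intr_disable(void)
{
    bool was_enabled = sim_intr_enabled;
    sim_intr_enabled = false;
    return was_enabled;
}

/* Hypothetical primitive: restore a previously saved interrupt state. */
static void sim_intr_restore(bool was_enabled)
{
    sim_intr_enabled = was_enabled;
}

/* Minimal stand-in for the per-thread data a scheduler modifies. */
struct rr_thread {
    int priority;
};

/* "Lock" a thread by disabling interrupts; the caller keeps the saved state. */
static bool rr_thread_lock(struct rr_thread *td)
{
    (void)td; /* on one core no per-thread lock object is consulted */
    return sim_intr_disable();
}

/* "Unlock" a thread by restoring the interrupt state saved at lock time. */
static void rr_thread_unlock(struct rr_thread *td, bool saved)
{
    (void)td;
    assert(!sim_intr_enabled); /* must still be inside the protected section */
    sim_intr_restore(saved);
}

/* Example of a protected update of thread data. */
static void rr_set_priority(struct rr_thread *td, int prio)
{
    bool saved = rr_thread_lock(td);
    td->priority = prio;
    rr_thread_unlock(td, saved);
}
```

In this model a call such as rr_set_priority(&td, 5) leaves the simulated interrupt flag exactly as it found it, and an inner lock/unlock pair inside an outer one keeps interrupts disabled until the outer pair ends.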

Every time the kernel wants to modify thread data, the corresponding thread is locked to prevent interference by other threads. Since we are using a single-core machine, why isn’t it sufficient to simply disable interrupts while modifying thread data? Could you please provide us with detailed information about the locking mechanism in FreeBSD and also answer the following questions?

What is the purpose of thread_lock/thread_unlock besides protecting thread data?
How does the TDQ LOCK work, and how is it related to a thread LOCK? Our understanding so far:
	- the thread LOCKs of all threads located in the run queue point to the TDQ LOCK, and
	- the TDQ LOCK points to the currently running thread
	- on a context switch, the current thread passes the TDQ LOCK to the newly chosen thread
	- Could you explain the idea behind that locking concept, please?
Do you have any suggestions on what we should take care of in our own lock implementation?
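
The relationship between the thread LOCK and the TDQ LOCK described in the list above can be modelled roughly as follows. This is a single-threaded userland sketch in C, not code from sched_ule.c. All sim_* names are hypothetical, the lock is a plain flag, and the retry loop in sim_thread_lock is an assumption about how a pointer-based thread lock has to be taken when the pointer may change. The only idea borrowed from the description is that each thread carries a pointer to whichever lock currently protects it, and that the lock stays held across the switch and is released by the newly chosen thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trivial lock model: a flag that records whether the lock is held. */
struct sim_lock { bool held; };

static void sim_lock_acquire(struct sim_lock *l) { assert(!l->held); l->held = true; }
static void sim_lock_release(struct sim_lock *l) { assert(l->held); l->held = false; }

struct sim_thread {
    struct sim_lock   *td_lock; /* lock that currently protects this thread */
    struct sim_thread *next;    /* run-queue link */
};

struct sim_tdq {
    struct sim_lock    lock;      /* the queue lock ("TDQ LOCK") */
    struct sim_thread *head;      /* FIFO run queue */
    struct sim_thread *curthread; /* currently running thread */
};

/* Lock a thread: lock what td_lock points at, retry if the pointer moved. */
static struct sim_lock *sim_thread_lock(struct sim_thread *td)
{
    for (;;) {
        struct sim_lock *l = td->td_lock;
        sim_lock_acquire(l);
        if (l == td->td_lock)
            return l;
        sim_lock_release(l); /* thread migrated to another lock meanwhile */
    }
}

static void sim_thread_unlock(struct sim_thread *td)
{
    sim_lock_release(td->td_lock);
}

/* Append a thread to the queue; from now on the queue lock protects it. */
static void sim_tdq_add(struct sim_tdq *q, struct sim_thread *td)
{
    assert(q->lock.held);
    td->td_lock = &q->lock;
    td->next = NULL;
    struct sim_thread **pp = &q->head;
    while (*pp != NULL)
        pp = &(*pp)->next;
    *pp = td;
}

/* Round-robin switch; the queue lock stays held for the new thread. */
static struct sim_thread *sim_switch(struct sim_tdq *q)
{
    assert(q->lock.held);
    struct sim_thread *next = q->head;
    if (next == NULL)
        return q->curthread;   /* nothing else to run */
    q->head = next->next;
    next->next = NULL;
    if (q->curthread != NULL)
        sim_tdq_add(q, q->curthread); /* requeue the old thread at the tail */
    q->curthread = next;
    return next;               /* caller (the new thread) releases the lock */
}
```

In this model, locking any queued thread is the same operation as locking the queue, and the thread returned by sim_switch is the one that eventually calls sim_thread_unlock.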


Kind regards,
Jens Krieg



start_init: trying /sbin/init

Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0x7fffffffefff
fault code			= supervisor write data, page not present
instruction pointer	= 0x20:0xffffffff808ab119
stack pointer	        = 0x28:0xffffff800020db30
frame pointer	        = 0x28:0xffffff800020dbe0
code segment		= base 0x0, limit 0xfffff, type 0x1b
				= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process	= 1 (kernel)
trap number		= 12
panic: page fault
KDB: stack backtrace:
#0 0xffffffff806e19cf at kdb_backtrace+0x5f
#1 0xffffffff806b2ddb at panic+0x15b
#2 0xffffffff808ac797 at trap_fatal+0x267
#3 0xffffffff808accfc at trap_pfault+0x40c
#4 0xffffffff808ad0ca at trap+0x37a
#5 0xffffffff8089839f at calltrap+0x8
#6 0xffffffff80687c4d at fork_exit+0x9d
#7 0xffffffff808988ce at fork_trampoline+0xe



More information about the freebsd-hackers mailing list