Before & After Under The Giant Lock

Sun Nov 25 03:13:30 PST 2007

On Sun, 25 Nov 2007, binto wrote:

> From what I read in "The Design and Implementation of the FreeBSD Operating 
> System",said:
>
> 'However, most of the heavily used parts of the kernel have been moved out 
> from under the giant lock, including much of the virtual memory system, the 
> networking stack, and the filesystem.'
>
> What the different "virtual memory system, the networking stack, and the 
> filesystem." before under giant lock & after moved out from under giant 
> lock?
>
> I'm interest get deeper learn operating system, especially with FreeBSD..

Binto,

Most currently available operating systems began life on uniprocessor 
hardware, and therefore started out with a kernel synchronization model 
intended to address concurrency generated by interrupt handlers, sleeping on 
I/O, etc., but not true parallelism.  Typically, that synchronization model has 
involved "disabling interrupts", perhaps with "interrupt levels" to allow 
prioritization and selective preemption, along with simple sleep locks 
intended to synchronize activities such as I/O in which a kernel thread may 
sleep.  And, as you might guess, that's where BSD, and hence FreeBSD, 
started out.

So the first step in introducing SMP support into an operating system is often 
to introduce a "Giant lock" around the majority of the kernel, allowing 
the kernel to effectively run on only one CPU at a time.  The intent there is 
to restore the assumptions of the uniprocessor (UP) kernel despite running on 
SMP hardware. 
This allows user programs to run on multiple CPUs at the same time, but 
prevents kernel parallelism.  This is relatively easy to introduce in a 
kernel, as it doesn't require changing the synchronization model for the 
entire kernel, just adding the Giant lock, modifying the probing/boot code, 
dealing with interrupt forwarding, dealing with TLB shootdowns, etc.  However, 
you don't get any parallelism win for the kernel at all, so if you have 
kernel-intensive workloads, you've gained nothing but overhead.
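
As a rough userland illustration of the Giant-lock stage described above, here 
is a C sketch using POSIX threads rather than real kernel code.  The names 
giant, vm_alloc_page, net_input and run_giant_demo are hypothetical, invented 
for this example.  Two unrelated subsystems share a single mutex, so their 
work is serialized even though they touch different data:

```c
#include <pthread.h>

/* Hypothetical stand-in for a single kernel-wide Giant lock. */
static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;

/* Unrelated pieces of state that nonetheless share the one lock. */
static long vm_pages_allocated;
static long net_packets_seen;

/* "VM" operation: must hold Giant even though it touches only VM state. */
void vm_alloc_page(void)
{
    pthread_mutex_lock(&giant);
    vm_pages_allocated++;
    pthread_mutex_unlock(&giant);
}

/* "Network" operation: waits behind any VM work that holds Giant. */
void net_input(void)
{
    pthread_mutex_lock(&giant);
    net_packets_seen++;
    pthread_mutex_unlock(&giant);
}

static void *vm_worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++)
        vm_alloc_page();
    return NULL;
}

static void *net_worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++)
        net_input();
    return NULL;
}

/*
 * Run n operations in each subsystem on separate threads and return the
 * combined count, or -1 if a thread could not be created.
 */
long run_giant_demo(long n)
{
    pthread_t vm_thread, net_thread;

    vm_pages_allocated = 0;
    net_packets_seen = 0;
    if (pthread_create(&vm_thread, NULL, vm_worker, &n) != 0)
        return -1;
    if (pthread_create(&net_thread, NULL, net_worker, &n) != 0) {
        pthread_join(vm_thread, NULL);
        return -1;
    }
    pthread_join(vm_thread, NULL);
    pthread_join(net_thread, NULL);
    return vm_pages_allocated + net_packets_seen;
}
```

The result is correct, but at any instant only one of the two threads can be 
inside "kernel" code, which is the lack of parallelism described above.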

So the next stage in SMP support is to start to modify the kernel 
synchronization model so that parts of the kernel can start to run in parallel 
on multiple CPUs, ideally leading to speedup.  For FreeBSD, the "Giant lock" 
was introduced in FreeBSD 3, and then we started to break down that lock in 
FreeBSD 5.  In FreeBSD 6, the Giant lock is gone from most of the kernel most 
of the time, and in FreeBSD 7, even less of it remains.  There are still some 
edge cases where Giant is present -- less commonly used file systems, some 
older device drivers, etc. -- but almost all of the time when in the steady 
state, you're not seeing Giant-protected code running.  It's worth noting that 
if you take half the kernel out from under Giant, you've also improved the 
performance of the code still under Giant, since it has less other code 
to contend with.
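
To make the idea of moving code out from under Giant concrete, here is a 
minimal userland sketch in C with POSIX threads.  It is not FreeBSD kernel 
code, and struct subsystem, subsystem_op and run_split_demo are hypothetical 
names chosen for the example.  Each subsystem carries its own lock, so work in 
one never waits on work in the other:

```c
#include <pthread.h>

/* Hypothetical subsystem that owns its lock instead of sharing Giant. */
struct subsystem {
    pthread_mutex_t lock;
    long ops;
};

/* One operation on a subsystem, serialized only against that subsystem. */
void subsystem_op(struct subsystem *s)
{
    pthread_mutex_lock(&s->lock);
    s->ops++;
    pthread_mutex_unlock(&s->lock);
}

struct split_job {
    struct subsystem *s;
    long n;
};

static void *split_worker(void *arg)
{
    struct split_job *job = arg;
    for (long i = 0; i < job->n; i++)
        subsystem_op(job->s);
    return NULL;
}

/*
 * Run n operations on each of two independently locked subsystems in
 * parallel and return the combined count, or -1 on thread creation failure.
 */
long run_split_demo(long n)
{
    struct subsystem vm_subsys, net_subsys;
    struct split_job vm_job = { &vm_subsys, n };
    struct split_job net_job = { &net_subsys, n };
    pthread_t vm_thread, net_thread;
    long total = -1;

    pthread_mutex_init(&vm_subsys.lock, NULL);
    pthread_mutex_init(&net_subsys.lock, NULL);
    vm_subsys.ops = 0;
    net_subsys.ops = 0;

    if (pthread_create(&vm_thread, NULL, split_worker, &vm_job) == 0) {
        if (pthread_create(&net_thread, NULL, split_worker, &net_job) == 0) {
            pthread_join(net_thread, NULL);
            pthread_join(vm_thread, NULL);
            total = vm_subsys.ops + net_subsys.ops;
        } else {
            pthread_join(vm_thread, NULL);
        }
    }

    pthread_mutex_destroy(&vm_subsys.lock);
    pthread_mutex_destroy(&net_subsys.lock);
    return total;
}
```

With separate locks the two threads can make progress on different CPUs at 
the same time, and whatever still shares a common lock sees fewer competitors 
for it.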

At this point, Giant is gradually becoming a lock around the tty, newbus, usb, 
and msdosfs code, and we're largely at diminishing returns in terms of making 
improvements in parallelism through removing Giant.  In FreeBSD 7, the focus 
was on improving parallelism rather than removing Giant, with improvements in 
locking primitives, the scheduler, and lock granularity.  For example, most of 
the improvement in MySQL performance in FreeBSD 7 can be put down to a small 
number of changes:

- Conversion to 1:1 threads from M:N threads.

- Massive efficiency improvements in the sx(9) sleep locking primitive.

- Introduction of an efficient non-sleeping rw(9) locking primitive.

- Conversion of the kernel file descriptor table lock to a lower overhead
   sx(9) primitive, as well as efficiency improvements through redoing the
   locking to distinguish read and write locking.

- Move to fine-grained locking in UNIX domain sockets.

- Significant scalability improvements in scheduling due to introducing
   the ULE scheduler, sched_ule(4).

In FreeBSD 8, I expect we'll see a continued focus on both locking granularity 
and improving opportunities for kernel parallelism by better distributing 
workloads over CPU pools.  This is important because the number of cores/chip 
is continuing to increase dramatically, so MP performance will remain an 
important area of work.  That said, the results to date have been 
extremely promising, and I anticipate that we will continue to find ways to 
better exploit multiprocessor hardware, especially in the network stack.

Robert N M Watson
Computer Laboratory
University of Cambridge