Sysunit: Unit testing for the FreeBSD kernel

From: Ryan Stone <rysto32_at_gmail.com>
Date: Thu, 06 Jan 2022 22:26:52 UTC

Hello everyone,

For a while I've had a project on the back-burner that aims to allow
developers to write unit tests for their FreeBSD kernel code.  I've
named this project Sysunit.  I've been able to make a bunch of
progress on this in the past few weeks and have a small test suite in
a branch off of main.  The test suite is able to compile, link and
successfully test a kernel file (I even found at least two latent
bugs in it!).  I think that it's at the point where it's ready
for wider discussion.

The goal is to enable developers to take small units of kernel C code
(generally one or two source files), compile those files as userland
objects, and then link them against a test harness that's written to
exhaustively test the unit.  This test will be a normal userland
application with no special dependencies on the kernel.  The kernel
code being tested will be running completely in userland.

I expect that unit testing should bring us a number of benefits.  As
the tests are simple userland executables with no kernel dependencies,
the testing cycle is reduced to simply re-compiling the executable to
test kernel changes.  This will be significantly faster and easier
than having to boot and test a real kernel in a VM or on real
hardware.  Also, any debugging facilities available in userland can of
course be used on our sysunit tests, including gdb/lldb, valgrind and
LLVM sanitizers.

Unit testing will also make it significantly easier to perform fault
injection and test error paths.  As the kernel source is being run in
a synthetic environment, it's easy enough for a unit test to (for
example) report a fake error from a fake disk and confirm that this is
handled correctly by the kernel code under test.
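
To make the fault-injection idea concrete, here is a minimal sketch.
It is not code from the Sysunit branch: fake_disk, fake_disk_read and
unit_read_block are hypothetical names invented for illustration.  The
fake disk can be told to fail its next read, and the test then checks
that the code under test propagates the error.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>

// Hypothetical fake disk: a test double that can be told to fail.
struct fake_disk {
    int next_error = 0;         // errno value to report on the next read (0 = succeed)
    unsigned char fill = 0xab;  // byte pattern returned by successful reads
    int reads = 0;              // number of read calls observed
};

// Stand-in for the disk-read routine that the code under test calls.
int fake_disk_read(fake_disk &d, void *buf, std::size_t len)
{
    d.reads++;
    if (d.next_error != 0) {
        int error = d.next_error;
        d.next_error = 0;       // the fault is injected exactly once
        return error;
    }
    std::memset(buf, d.fill, len);
    return 0;
}

// Hypothetical unit under test: reads a block and must propagate any
// error from the disk, leaving *valid false in that case.
int unit_read_block(fake_disk &d, unsigned char *buf, std::size_t len,
    bool *valid)
{
    *valid = false;
    int error = fake_disk_read(d, buf, len);
    if (error != 0)
        return error;
    *valid = true;
    return 0;
}
```

A test would set next_error to EIO, call unit_read_block(), and check
that EIO comes back and that the buffer was not marked valid.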

And of course there are a number of other ancillary benefits of unit
tests, but this email has already gotten pretty long and I haven't
gotten to the meat yet, so I'll spare you all the full sales pitch.

In order to allow our kernel code to be compiled and run in
userland tests, I've worked on 5 major activities:

1. Modify kernel headers to expose KPIs to unit tests
2. Implement test doubles of those KPIs (i.e. implement userland
versions of the KPIs in ways that are useful to tests)
3. Write Pktgen, a library for programmatic creation and verification of packets
4. Add make infrastructure for building the test double libraries and
unit tests themselves
5. Write a unit test of the TCP LRO subsystem, to prove all of this works

My work in progress can be found in my git branch.  If you clone this,
beware, as I do rebase my branch frequently.

 https://github.com/rysto32/freebsd/commits/sysunit

1. Kernel Headers

My unit tests are compiled for userland, but they need to access
definitions of kernel types and prototypes of KPIs.  Unfortunately,
it's not as simple as defining _KERNEL and including the headers, as
several critical headers do not expose important userland definitions
like errno if _KERNEL is defined.  To resolve this conundrum, I've
introduced a new _KERNEL_UT (read as "kernel unit test") macro.  If
this macro is defined then private kernel definitions will be exposed
by the headers that I've modified.  Unfortunately, there are a few
KPIs that conflict with userland symbols, like malloc(9).  These
conflicting symbols will not be exposed.
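
As a rough illustration of the header pattern described above, the
sketch below shows how a converted header might guard its contents.
The header contents ("struct widget", widget_hold, and
widget_alloc_kernel_only) are hypothetical and do not come from the
FreeBSD tree; only the _KERNEL and _KERNEL_UT macro names are taken
from this email.

```cpp
// Normally supplied on the compiler command line as -D_KERNEL_UT.
#ifndef _KERNEL_UT
#define _KERNEL_UT 1
#endif

// _KERNEL is not defined here, so the userland errno definitions
// remain available to the test.
#include <cerrno>

// ---- sketch of a converted (hypothetical) kernel header ----
#if defined(_KERNEL) || defined(_KERNEL_UT)
// Private kernel definitions, now also visible to unit tests.
struct widget {
    int w_refs;
};
int widget_hold(struct widget *w);
#endif

#if defined(_KERNEL)
// A KPI that would clash with a userland symbol (the malloc(9)
// situation) stays kernel-only and is hidden from unit tests.
void *widget_alloc_kernel_only(unsigned long size);
#endif
// ---- end of header sketch ----

// Userland definition that a test could link against.
int widget_hold(struct widget *w)
{
    if (w == nullptr)
        return EINVAL;
    w->w_refs++;
    return 0;
}
```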

You can see an example of what this entails in this commit:

https://github.com/rysto32/freebsd/commit/60704dfc6bf19be879e6006493e7acd5c0f1020e

I've also had to make a few other minor modifications.  For example,
my tests are written using Google Test, which is a C++ framework, so I
had to replace any usages of C++ keywords as identifiers.  There was
also one unhappy case where I had to appease the C++ type checker by
adding a cast from void* in sys/mbuf.h.
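
For readers less familiar with the C/C++ differences involved, here is
a small sketch of both kinds of change.  The structures are
hypothetical and are not the actual sys/mbuf.h code: a field that
might once have been named "class" is renamed, and a conversion from
void* that C performs implicitly gets an explicit cast that both
languages accept.

```cpp
// Hypothetical header fragment meant to compile as both C and C++.
struct dev_softc {
    int unit;
};

struct dev_entry {
    int cls;     // a field named "class" would not compile as C++
    void *priv;  // opaque pointer to per-device state
};

// In C, "return e->priv;" would compile as-is because void* converts
// implicitly.  C++ rejects that, so the cast is spelled out; a C-style
// cast keeps the header valid in both languages.
static inline struct dev_softc *
dev_entry_softc(struct dev_entry *e)
{
    return (struct dev_softc *)e->priv;
}
```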

Note that I haven't converted every kernel header in this way.  I've
only converted the headers necessary to get my test compiling.
Additional headers can be faulted-in and converted as more tests
covering other areas of the kernel are written.

2. KPI Test Doubles

Our kernel code depends on a huge number of KPIs (kernel APIs), such
as mbufs, UMA, mutexes, etc.  In order to compile and run kernel code
in a userland test, we require userland implementations of these KPIs
that can be linked into the test.  I've implemented test doubles for
all of the KPIs depended on, either directly or indirectly, by LRO.

Implementing new test doubles is the lion's share of the work in
getting a new unit test up and running.  My hope is that as we add
more tests, new tests can reuse test doubles that have already been
implemented.  Hopefully we reach a point where there is significantly
less overhead for writing new tests, as that is definitely a big
barrier to writing new tests right now.
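
To give a flavour of what a test double can look like, here is a
minimal sketch of a userland stand-in for a non-recursive kernel
mutex.  It is an assumption-laden illustration, not the implementation
in my branch: fake_mtx, fake_mtx_lock, fake_mtx_unlock and
counter_bump are hypothetical names.  The double does no real locking;
it only records state so that a test can catch misuse such as
recursing on the lock or unlocking a lock that is not held.

```cpp
#include <stdexcept>

// Hypothetical userland double for a non-recursive kernel mutex.
struct fake_mtx {
    bool locked = false;  // is the mutex currently held?
    int lock_count = 0;   // how many times it has been acquired
};

void fake_mtx_lock(fake_mtx &m)
{
    if (m.locked)
        throw std::logic_error("recursed on non-recursive fake mutex");
    m.locked = true;
    m.lock_count++;
}

void fake_mtx_unlock(fake_mtx &m)
{
    if (!m.locked)
        throw std::logic_error("unlock of fake mutex that is not held");
    m.locked = false;
}

// Hypothetical code under test: bumps a counter under the lock.
struct counter {
    fake_mtx lock;
    int value = 0;
};

int counter_bump(counter &c)
{
    fake_mtx_lock(c.lock);
    int v = ++c.value;
    fake_mtx_unlock(c.lock);
    return v;
}
```

A test can then assert both on the result of counter_bump() and on the
lock's bookkeeping, for example that the lock was released afterwards.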

The test doubles that I've implemented so far can be found beneath
this directory on my git branch:

https://github.com/rysto32/freebsd/tree/sysunit/lib/sysunit

3. Pktgen

To test LRO, I had to send a variety of different packets at it to
test how it reacted.  I wound up writing a library for generating test
packets.  The basic approach is that you can define at compile time a
packet template describing the contents of a packet.  The template can
contain any number of headers followed by a payload.  Templates are
read-only, but you can apply mutating operations to a template to
generate a new one that has all of the same field values as the
original template, with the exception of the fields modified by the
mutators.

With a template you can either generate an mbuf containing the packet
specified by the template, or test that an mbuf's contents match the
template.
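
The template-plus-mutator idea can be sketched as follows.  This is a
heavily simplified illustration and not Pktgen's actual API: the
PacketTemplate type, its fields, and the WithSeq, WithPayloadLen,
Generate and Matches names are all hypothetical, and a flat byte
vector stands in for an mbuf.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified packet template: one made-up header
// (two ports and a sequence number) followed by a payload.
struct PacketTemplate {
    std::uint16_t src_port;
    std::uint16_t dst_port;
    std::uint32_t seq;
    std::size_t payload_len;
    std::uint8_t payload_byte;

    // Mutators are const: each returns a modified copy and leaves
    // the original template untouched.
    constexpr PacketTemplate WithSeq(std::uint32_t s) const
    {
        PacketTemplate t = *this;
        t.seq = s;
        return t;
    }

    constexpr PacketTemplate WithPayloadLen(std::size_t n) const
    {
        PacketTemplate t = *this;
        t.payload_len = n;
        return t;
    }

    // Serialise the packet in network byte order.
    std::vector<std::uint8_t> Generate() const
    {
        std::vector<std::uint8_t> out;
        auto put = [&out](std::uint32_t v, int bytes) {
            for (int shift = 8 * (bytes - 1); shift >= 0; shift -= 8)
                out.push_back(static_cast<std::uint8_t>(v >> shift));
        };
        put(src_port, 2);
        put(dst_port, 2);
        put(seq, 4);
        out.insert(out.end(), payload_len, payload_byte);
        return out;
    }

    // Verify that a received packet matches this template exactly.
    bool Matches(const std::vector<std::uint8_t> &pkt) const
    {
        return pkt == Generate();
    }
};

// A template defined at compile time.
constexpr PacketTemplate kBase{1234, 80, 1000, 4, 0x5a};
```

A test could derive the next segment with kBase.WithSeq(1004), feed
the bytes from Generate() to the code under test, and use Matches() on
whatever comes out the other side.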

I'm not sure how generally useful this library will be but I
definitely found it to be a useful tool for concisely specifying the
packets I was testing with in my LRO test.

Pktgen's implementation can be found here.  Note that this makes heavy
use of C++17's template features.

https://github.com/rysto32/freebsd/tree/sysunit/lib/sysunit/pktgen

4. Make infrastructure

I've added a new bsd.sysunit.mk file.  This will set build variables
like CFLAGS appropriately for building test double libraries and
sysunit tests (e.g. it will add -D_KERNEL_UT to address the header
file issue already discussed).  This also adds support for compiling
kernel files into userland objects.  Kernel files require special
compiler flags on top of the normal sysunit flags, so my new KSRCS
variable handles this.

This can be seen here:
https://github.com/rysto32/freebsd/blob/sysunit/share/mk/bsd.sysunit.mk

5. Unit tests for TCP LRO

I chose to write unit tests for LRO for a few reasons.  First, LRO is
a relatively self-contained piece of code, so I hoped that this meant
it would have minimal dependencies on KPIs, and therefore a lot fewer
test doubles for me to write up-front.  This honestly didn't work out
as well as I'd hoped.

I also chose it because it was a piece of code that I had enough
familiarity with that I felt I could make a decent cut at some unit
tests.  Finally, at the time that I started the project an interesting
bug had been fixed in it and I thought that writing a unit test that
could easily demonstrate the bug would make for an interesting talking
point.

I've written two test suites.  The first is a collection of 3 sample
tests that demonstrate some simple unit tests.  I've commented the
tests extensively to allow people with no familiarity with these test
APIs to follow the logic:

https://github.com/rysto32/freebsd/blob/sysunit/tests/sys/sysunit/tcp_lro_sample.cc

I've also written a more complicated set of tests that try to test LRO
more extensively.  These tests use some more advanced gtest features
to allow a single set of tests to be run in both IPv4 mode and IPv6
mode:

https://github.com/rysto32/freebsd/blob/sysunit/tests/sys/sysunit/tcp_lro_test.cc
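
The general idea of writing a test body once and running it for both
address families can be sketched without gtest, using plain C++17
templates.  Google Test provides typed tests (TYPED_TEST) and
value-parameterized tests (TEST_P) for this purpose; the sketch below
does not use either and is not taken from tcp_lro_test.cc.  The traits
types and function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-family constants.  20 and 40 are the minimum IPv4 and the fixed
// IPv6 header lengths; the type names themselves are made up.
struct Ipv4Traits {
    static constexpr int kVersion = 4;
    static constexpr std::size_t kIpHdrLen = 20;
};

struct Ipv6Traits {
    static constexpr int kVersion = 6;
    static constexpr std::size_t kIpHdrLen = 40;
};

constexpr std::size_t kTcpHdrLen = 20;  // TCP header without options

// Build a zeroed packet whose first byte carries the IP version in
// its high nibble, as both IPv4 and IPv6 headers do.
template <typename Family>
std::vector<std::uint8_t> BuildPacket(std::size_t payload_len)
{
    std::vector<std::uint8_t> pkt(
        Family::kIpHdrLen + kTcpHdrLen + payload_len, 0);
    pkt[0] = static_cast<std::uint8_t>(Family::kVersion << 4);
    return pkt;
}

// One test body, written once and instantiated per family.
template <typename Family>
bool TestPayloadOffset()
{
    std::vector<std::uint8_t> pkt = BuildPacket<Family>(100);
    std::size_t payload = pkt.size() - (Family::kIpHdrLen + kTcpHdrLen);
    return (pkt[0] >> 4) == Family::kVersion && payload == 100;
}

// Run the same test for every listed family (C++17 fold expression).
template <typename... Families>
bool RunForAll()
{
    return (TestPayloadOffset<Families>() && ...);
}
```

Calling RunForAll<Ipv4Traits, Ipv6Traits>() then exercises the same
logic in both modes.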

Just for fun, these two tests will currently fail on main due to bugs
in LRO.  I have fixes in my branch for these and I'll submit them to Phab
soon:

https://github.com/rysto32/freebsd/blob/sysunit/tests/sys/sysunit/tcp_lro_test.cc#L381
https://github.com/rysto32/freebsd/blob/sysunit/tests/sys/sysunit/tcp_lro_test.cc#L573