[RFC] Kernel shared variables

Giovanni Trematerra giovanni.trematerra at gmail.com
Fri Jun 1 22:23:43 UTC 2012


On Fri, Jun 1, 2012 at 11:21 PM, Bruce Evans <brde at optusnet.com.au> wrote:
> On Fri, 1 Jun 2012, Giovanni Trematerra wrote:
>
>> I'd like to discuss a way to provide a mechanism to share some read-only
>> data between kernel and user space programs avoiding syscall overhead,
>> implementing some of them, such as gettimeofday(3) and time(3), as ordinary
>> user space routines.
>
>
> This is particularly unsuitable for implementing gettimeofday(), since for
> it to work you would need to use approximately 1 CPU spinning in the
> kernel to update the time every microsecond.  For time(3), it only needs
> a relatively slow update.  For clock_gettime() with nanosecond precision,
> it is even more unsuitable.  For clock_gettime() with precisions between
> 1 second and 1 microsecond, it is intermediately unsuitable.
>
> It also requires some complications for locking/atomicity and coherency
> (much the same as in the kernel).  Not just for times.  For times, the
> kernel handles locking/atomicity fairly well, and coherency fairly badly.
>

Well, the primary intent of the patch is to provide a mechanism to share data
between kernel and userland without switching into kernel mode, not to provide
a complete re-implementation in user mode of all the time-related functions.

>
>> The patch at
>> http://www.trematerra.net/patches/ksvar_experimental.patch
>>
>> is in a very experimental stage. It's just a proof-of-concept.
>> Only works for an AMD64 kernel and only for 64-bit applications.
>> The idea is to place all the variables that we want to share between kernel
>> and user space into one or more consecutive pages of memory that will be
>> mapped read-only into every running process. At the start of the first
>> shared page there'll be a table with as many entries as the number of
>> shared variables. Each entry is a 32-bit value that is the offset between
>> the start of the shared page and the start of the variable in the page.
>> The user space processes need to find out the mapped address of the shared
>> page and use the table to access the shared variables.
>
>
> On amd64, 2 32-bit values or 64-bit values with most bits 0 or 1 can be
> packed/encoded into 1 64-bit value to give a certain atomicity without
> locking.  The corresponding i386 packing into 1 32-bit value doesn't work
> so well.

These values are written just one time during a SYSINIT routine and are only
read by user processes.
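
To make the table lookup concrete, here is a minimal sketch of how a user
process might resolve a shared variable from the offset table. It is not taken
from the patch: the names ksvar_lookup and ksvar_index, and the way the caller
learns the page address and the number of entries, are assumptions for
illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout: the shared page begins with a table of 32-bit
 * offsets, one per shared variable.  Each offset is relative to the
 * start of the page.  The indices below are hypothetical.
 */
enum ksvar_index {
	KSVAR_WALL_CLOCK = 0,
	KSVAR_COUNT
};

/*
 * Return a read-only pointer to shared variable idx, or NULL if the
 * page is not mapped or idx is out of range.
 */
static const void *
ksvar_lookup(const void *page_base, size_t nvars, size_t idx)
{
	const uint32_t *table = page_base;

	if (page_base == NULL || idx >= nvars)
		return (NULL);
	return ((const char *)page_base + table[idx]);
}
```

Since the offsets are written once at boot and never change, the lookup needs
no locking; only the variables themselves may need care when they are updated.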

>
>> ...
>
>
>> Just as proof of concept I re-implemented gettimeofday(3) in user space.
>> First of all, I didn't remove the entry in syscall.master; I just renamed
>> sys_gettimeofday. I need it for the fallback path.
>> In the kernel I introduced a struct wall_clock.
>>
>> +struct wall_clock
>> +{
>> +       struct timeval  tv;
>> +       struct timezone tz;
>> +};
>
>
> This is much larger than 64 bits.  struct timezone is relatively unimportant.
> struct timeval is bloated on amd64 (128 bits), but can be packed into 64
> bits (works for a few hundred years).  On i386, it could be packed into
> 20 bits for tv_usec and 12 bits for an offset for tv_sec.
>
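
As a rough illustration of the packing suggested above, the sketch below stores
a struct timeval in one 64-bit word so that it can be read with a single
aligned 64-bit load. The split (20 low bits for tv_usec, the remaining 44 bits
for tv_sec) and the names tv_pack and tv_unpack are assumptions, not something
from the patch or from the quoted message, and tv_sec is assumed non-negative.

```c
#include <stdint.h>
#include <sys/time.h>

/* 20 bits are enough for tv_usec, since 2^20 > 999999. */
#define	TV_USEC_BITS	20
#define	TV_USEC_MASK	((UINT64_C(1) << TV_USEC_BITS) - 1)

/* Pack seconds into the high bits and microseconds into the low bits. */
static uint64_t
tv_pack(const struct timeval *tv)
{
	return (((uint64_t)tv->tv_sec << TV_USEC_BITS) |
	    ((uint64_t)tv->tv_usec & TV_USEC_MASK));
}

/* Recover both fields from a packed word. */
static void
tv_unpack(uint64_t packed, struct timeval *tv)
{
	tv->tv_sec = (time_t)(packed >> TV_USEC_BITS);
	tv->tv_usec = (suseconds_t)(packed & TV_USEC_MASK);
}
```

A reader would load the packed word once and unpack its local copy, so it never
sees tv_sec from one update combined with tv_usec from another.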

Thanks a lot for your explanation. I think it will be valuable as a reference.
Nonetheless, I wrote gettimeofday that way only as a proof-of-concept, to show
how things are supposed to work; it wasn't meant to be correct.
I think it was just unfortunate to have chosen gettimeofday.
I'm most interested in the VM parts of the patch.

--
Gianni


More information about the freebsd-arch mailing list