Prefaulting for i/o buffers

Attilio Rao attilio at freebsd.org
Fri Feb 3 20:10:21 UTC 2012


2012/2/3 Konstantin Belousov <kostikbel at gmail.com>:
> The FreeBSD I/O infrastructure has a well-known deadlock caused by a
> vnode lock order reversal when the buffers supplied to the read(2) or
> write(2) syscalls are backed by an mmapped file.
>
> I previously published patches that convert the I/O path to use VMIO,
> based on Jeff Roberson's proposal; see
> http://wiki.freebsd.org/VM6. As a side effect, VM6 fixed the
> deadlock. Since that work is very intrusive and did not get any
> follow-up, it stalled.
>
> Below is a very lightweight patch whose only goal is to fix the
> deadlock in the least intrusive way. This became possible after
> FreeBSD gained the vm_fault_quick_hold_pages(9) and
> vm_fault_disable_pagefaults(9) KPIs.
> http://people.freebsd.org/~kib/misc/vm1.3.patch
>
> The theory of operation is described in the patched
> sys/kern/vfs_vnops.c; see the preamble comment for vn_io_fault(). The
> patch borrows the rangelock implementation from VM6, which was
> discussed and improved together with Attilio Rao.
>
> With the patch, I was not able to reproduce the deadlock in a targeted
> test running for several hours, while stock HEAD deadlocks on the
> first iteration.
>
> Below is a benchmark of the worst-case situation for the patched
> system: reading 1 byte from a file in a loop. The value is the time in
> seconds to execute read(2) for a single byte and lseek(2) back to the
> start of the file. The loop is executed 100,000,000 times. The machine
> has a 3.4GHz Core i7 2600K and ran HEAD at 230866 with debugging
> options turned off.
>
> As you can see, the rangelock overhead for the worst (but uncontended)
> case is less than 10%.
>
> x stock-1-byte.txt
> + vm1-1-byte.txt
> +--------------------------------------------------------------------------+
> |xx                                                                      ++|
> |xxx                                                                    +++|
> ||A                                                                     |A||
> +--------------------------------------------------------------------------+
>    N           Min           Max        Median           Avg        Stddev
> x   5  1.063206e-06  1.065569e-06  1.064172e-06  1.064109e-06 9.8031959e-10
> +   5  1.167145e-06  1.170244e-06  1.168939e-06 1.1690444e-06 1.2477022e-09
> Difference at 95.0% confidence
>        1.04935e-07 +/- 1.63638e-09
>        9.86134% +/- 0.153779%
>        (Student's t, pooled s = 1.122e-09)

Do you have an ETA for reviews? When do you plan to commit this?
It would be valuable to get a better grasp of the benchmark and to
reduce the performance difference as much as possible.
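
For anyone who wants to reproduce numbers of this kind, the loop
described in the quoted text (read(2) of a single byte, then lseek(2)
back to the start of the file) could be approximated as below. This is
only a sketch and not the program used for the posted results: the
function name bench_read_lseek, the use of clock_gettime(2) with
CLOCK_MONOTONIC, and the BENCH_MAIN guard are all assumptions made for
illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: time `iters` iterations of "read(2) one byte,
 * then lseek(2) back to offset 0" on the file at `path`.
 * Returns the average wall-clock seconds per iteration, or -1.0 on
 * error. The file must contain at least one byte.
 */
double
bench_read_lseek(const char *path, long iters)
{
	struct timespec start, end;
	char byte;
	long i;
	int fd;

	assert(path != NULL);
	if (iters <= 0)
		return (-1.0);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1.0);

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0) {
		close(fd);
		return (-1.0);
	}
	for (i = 0; i < iters; i++) {
		/* A short read (e.g. empty file) is treated as an error. */
		if (read(fd, &byte, 1) != 1 ||
		    lseek(fd, 0, SEEK_SET) == (off_t)-1) {
			close(fd);
			return (-1.0);
		}
	}
	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0) {
		close(fd);
		return (-1.0);
	}
	close(fd);

	return (((double)(end.tv_sec - start.tv_sec) +
	    (double)(end.tv_nsec - start.tv_nsec) / 1e9) / (double)iters);
}

#ifdef BENCH_MAIN
/* Usage (assumed): ./bench <file> [iterations] */
int
main(int argc, char **argv)
{
	long iters = (argc > 2) ? strtol(argv[2], NULL, 10) : 100000000L;
	double avg;

	if (argc < 2) {
		fprintf(stderr, "usage: %s file [iterations]\n", argv[0]);
		return (1);
	}
	avg = bench_read_lseek(argv[1], iters);
	if (avg < 0.0) {
		fprintf(stderr, "benchmark failed\n");
		return (1);
	}
	printf("%e\n", avg);
	return (0);
}
#endif
```

Built with -DBENCH_MAIN and run several times on a stock and on a
patched kernel, this prints one per-iteration time per run, which is
the form of input ministat(1) expects for a comparison like the one
quoted above.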

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

