Prefaulting for i/o buffers
Attilio Rao
attilio at freebsd.org
Fri Feb 3 20:10:21 UTC 2012
2012/2/3 Konstantin Belousov <kostikbel at gmail.com>:
> FreeBSD I/O infrastructure has a well-known deadlock issue caused
> by vnode lock order reversal when buffers supplied to the read(2) or
> write(2) syscalls are backed by an mmapped file.
>
> I previously published patches to convert the I/O path to use VMIO,
> based on Jeff Roberson's proposal; see
> http://wiki.freebsd.org/VM6. As a side effect, VM6 fixed the
> deadlock. Since that work is very intrusive and did not get any
> follow-up, it stalled.
>
> Below is a very lightweight patch whose only goal is to fix the deadlock
> in the least intrusive way. This became possible after FreeBSD got the
> vm_fault_quick_hold_pages(9) and vm_fault_disable_pagefaults(9) KPIs.
> http://people.freebsd.org/~kib/misc/vm1.3.patch
>
> The theory of operation is described in the patched sys/kern/vfs_vnops.c;
> see the preamble comment for vn_io_fault(). The patch borrows the
> rangelock implementation from VM6, which was discussed and improved
> together with Attilio Rao.
>
> I was not able to reproduce the deadlock in the targeted test running
> for several hours, while stock HEAD deadlocks in the first iteration.
>
> Below is the benchmark for the worst-case situation for the patched
> system: reading 1 byte from a file in a loop. The value is the time in
> seconds to execute read(2) for a single byte and lseek(2) back to the
> start of the file. The loop is executed 100,000,000 times. The machine
> has a 3.4GHz Core i7 2600K and ran HEAD at 230866 with debugging options
> turned off.
>
> As you can see, the rangelock overhead for the worst (but uncontended)
> case is less than 10%.
>
> x stock-1-byte.txt
> + vm1-1-byte.txt
> +--------------------------------------------------------------------------+
> |xx ++|
> |xxx +++|
> ||A |A||
> +--------------------------------------------------------------------------+
>     N           Min           Max        Median           Avg        Stddev
> x   5  1.063206e-06  1.065569e-06  1.064172e-06  1.064109e-06 9.8031959e-10
> +   5  1.167145e-06  1.170244e-06  1.168939e-06 1.1690444e-06 1.2477022e-09
> Difference at 95.0% confidence
>         1.04935e-07 +/- 1.63638e-09
>         9.86134% +/- 0.153779%
>         (Student's t, pooled s = 1.122e-09)
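
[Editorial illustration, not part of the original message or of the patch.]
The I/O pattern named at the top of the quoted message, a read(2) whose
destination buffer is itself a mapping of a file, can be sketched in userland
as below. The function name read_into_mapping() and its parameters are
hypothetical. The sketch assumes that the kernel copies data out to the user
buffer while the source vnode is locked, so a page fault on the buffer has to
touch the backing file's vnode as well; that reading follows the description
above and is not taken from the patch. A single call like this is not expected
to deadlock by itself; it only shows the shape of the access.

```c
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * read(2) up to `len` bytes from the file `src` into a buffer that is a
 * shared mapping of the file `backing`.  Returns the number of bytes read,
 * or -1 on error.  `backing` must already be at least `len` bytes long.
 * (Hypothetical helper, for illustration only.)
 */
ssize_t
read_into_mapping(const char *src, const char *backing, size_t len)
{
    ssize_t n = -1;
    void *buf;
    int sfd, bfd;

    sfd = open(src, O_RDONLY);
    bfd = open(backing, O_RDWR);
    if (sfd == -1 || bfd == -1)
        goto out;

    /* The user buffer handed to read(2) is backed by a file, not by anon memory. */
    buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, bfd, 0);
    if (buf == MAP_FAILED)
        goto out;

    /*
     * Assumption: while copying into `buf` the kernel may fault on pages
     * that belong to `backing`, which is where the second vnode comes in.
     */
    n = read(sfd, buf, len);
    munmap(buf, len);
out:
    if (sfd != -1)
        close(sfd);
    if (bfd != -1)
        close(bfd);
    return (n);
}
```

A reproduction attempt would presumably run such calls concurrently from
several threads or processes with the roles of the two files swapped; the
targeted test mentioned in the quoted message is not shown there, so this is
only a guess at its general shape.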

Do you have an ETA for reviews? When do you plan to commit this?
It would be valuable to get a grasp on the benchmark and refine the
performance difference as much as possible.
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein