new vm_pager_get_pages() KPI, round 3

Gleb Smirnoff glebius at FreeBSD.org
Mon Dec 14 17:50:58 UTC 2015


On Mon, Dec 14, 2015 at 01:13:35PM +0200, Konstantin Belousov wrote:
K> I fail to understand how the case of count > 1 and non-contiguous blocks
K> in the non-readahead case is handled by new vnode_pager_generic_getpages().
K> 
K> I do not understand how a hole somewhere in the requested range is handled.
K> Code has a comment that a hole must not appear in the range.
K> 
K> Both issues mean that vm_pager_has_page() still must be called before
K> pagein, for count > 1 use.  E.g. the exec_map_first_page() uses *after
K> value returned from has_pages() to calculate count, which is an advisory
K> and not the contract.  Same issue prevents converting GEM and TTM (and
K> probably md) to use the count > 1 KPI.

The *after and *before are now not advisory, but a contract. Those consumers,
who want to utilize count > 1, must preceed the call to vm_pager_get_pages()
with call to vm_pager_haspage(). Only region approved by vm_pager_haspage()
or smaller will succeed.

K> Same is true for swap pager, and this prevents the removal of the loop
K> in vm_thread_swaping().
K> 
K> Code assumes that the partially valid page may only appear in the last
K> position of the page run for the local pager, which again requires
K> pre-validation of the vm_pager_get_pages() on the caller side.

Yes, asking for page in into a valid page is a risk of data corruption.

K> Overall this is not an KPI that was discussed. It seemingly does not
K> change semantic for count == 1 case, but is not what it should be for
K> count > 1. As discussed, new vm_pager_get_pages() was support to just
K> work for any count, doing the loop over the non-contig ranges or short
K> reads, and guaranteeing that all existing (or hole-filled) pages are
K> read until EOF is met. This KPI was supposed to:
K> - fix my compaints about short reads

I will not take your complaints about short reads. The get pages KPI
is not a complement to VOP_READ(), neither of a read(2) syscall. If
a underlying filesystem has problems in it, it must deal with these
problems on its own, doing multiple I/Os per VOP_GETPAGES().

K> - avoid excessive VOP_BMAP() call from has_pages before get_pages()

It is now avoided for count == 1.

K> - allowed to remove the loops from all current get_pages() consumers,
K>   it vm_thread_swapin(), GEM/TTM, image activator

This wasn't discussed at all. I like this idea, that can be done later.

-- 
Totus tuus, Glebius.


More information about the freebsd-arch mailing list