svn commit: r230782 - head/sys/kern
John Baldwin
jhb at freebsd.org
Mon Jan 30 20:12:31 UTC 2012
On Monday, January 30, 2012 2:35:16 pm John Baldwin wrote:
> Author: jhb
> Date: Mon Jan 30 19:35:15 2012
> New Revision: 230782
> URL: http://svn.freebsd.org/changeset/base/230782
>
> Log:
> Refine the implementation of POSIX_FADV_NOREUSE for the read(2) case such
> that instead of using direct I/O it allows read-ahead similar to
> POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the
> read(2) has completed to purge just-read data. The write(2) path continues
> to use direct I/O for POSIX_FADV_NOREUSE for now. Note that NOREUSE works
> optimally if an application reads and writes full fs blocks.
Oops, forgot:
Tested by: jilles
The NOREUSE bits may still need further refinement. For example, if we allow
something along the lines of 'POSIX_FADV_NOREUSE | POSIX_FADV_SEQUENTIAL',
then we could change the VOP_ADVISE() here to use 0 as the starting offset
which should do a better job of not leaving data in RAM due to reading partial
blocks. Also, sequentially reading a file at offsets that are not block-aligned
with NOREUSE can currently result in extraneous reads, and we could possibly
alleviate those by changing DONTNEED to only flush wholly-contained blocks rather
than wholly-contained pages from the backing VM object. However, without the
previous change I suggested, that will exacerbate the problem of NOREUSE not
actually purging any data from RAM. The problem with the | approach, though, is
that it is not portable, so it is not likely that portable programs like vlc
will use it. HP-UX had an extended variant of fadvise() that allowed multiple
policies to be set on a range, apparently to handle exactly this case
(sequential and noreuse). The problem seems to be that noreuse is really
orthogonal to the other access-pattern hints (normal vs random vs sequential).
Finally, I've wondered if POSIX_FADV_SEQUENTIAL shouldn't just mandate the
maximum read-ahead and write-clustering rather than using the heuristics.
If we did that, it's not completely clear what the "right" thing to do would be
if an application does posix_fadvise(POSIX_FADV_SEQUENTIAL) followed by
fcntl(F_READAHEAD) with a different size, esp. given that posix_fadvise()
can theoretically apply to only a range of the file, whereas
F_READAHEAD applies globally to the file descriptor.
--
John Baldwin