Per-mount syncer threads and fanout for pagedaemon cleaning
mdf at FreeBSD.org
Tue Dec 27 16:59:43 UTC 2011
On Tue, Dec 27, 2011 at 8:05 AM, Attilio Rao <attilio at freebsd.org> wrote:
> 2011/12/27 Giovanni Trematerra <giovanni.trematerra at gmail.com>:
>> On Mon, Dec 26, 2011 at 9:24 PM, Venkatesh Srinivas
>> <vsrinivas at dragonflybsd.org> wrote:
>>> I've been playing with two things in DragonFly that might be of interest
>>> Thing #1 :=
>>> First, per-mountpoint syncer threads. Currently there is a single thread,
>>> 'syncer', which periodically calls fsync() on dirty vnodes from every mount,
>>> along with calling vfs_sync() on each filesystem itself (via syncer vnodes).
>>> My patch modifies this to create syncer threads for mounts that request it.
>>> For these mounts, vnodes are synced from their mount-specific thread rather
>>> than the global syncer.
>>> The idea is that periodic fsync/sync operations from one filesystem should
>>> not stall or delay synchronization for other ones.
>>> The patch was fairly simple:
>> There's something WIP by attilio@ on that area.
>> you might want to take a look at
>> I don't know what hammerfs needs, but UFS/FFS and the buffer cache do a good
>> job performance-wise, and so the authors are skeptical about the boost that such
>> a change can give. We believe that brain cycles need to be spent on
>> other pieces of the system such as ARC and ZFS.
> More specifically, focusing on UFS and the buffer cache for performance
> is likely not really useful; we should direct our efforts toward
> ARC and ZFS.
> Also, the real bottlenecks in our I/O paths are in GEOM's
> single-threaded design, lack of unmapped I/O functionality, possibly
> lack of prioritized I/O, etc.
Indeed, Isilon (and probably other vendors as well) entirely skip
VFS_SYNC when the WAIT argument is MNT_LAZY. Since we're a
distributed journalled filesystem, syncing via a system thread is not
a relevant operation; i.e. all writes that have exited a VOP_WRITE or
similar operation are already in reasonably stable storage in a
journal on the relevant nodes.
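
To make the early return concrete, here is a minimal userland sketch of a
sync entry point that ignores lazy requests. It is not Isilon's code: the
names examplefs_sync, struct example_mount, and the SYNC_* constants are
made up for illustration and merely stand in for a real VFS_SYNC handler and
the kernel's MNT_WAIT / MNT_NOWAIT / MNT_LAZY values.

```c
#include <stddef.h>

/* Illustrative stand-ins for the kernel's waitfor values (assumption). */
enum sync_waitfor { SYNC_WAIT = 1, SYNC_NOWAIT = 2, SYNC_LAZY = 3 };

/* Hypothetical per-mount bookkeeping for a journalled filesystem. */
struct example_mount {
	unsigned long full_syncs;	/* requests that did real sync work */
	unsigned long lazy_skips;	/* lazy requests that were ignored */
};

/*
 * Hypothetical sync entry point. A lazy request comes from the periodic
 * syncer; since completed writes are assumed to be in the journal already,
 * there is nothing useful to do and the call returns immediately.
 */
int
examplefs_sync(struct example_mount *mp, int waitfor)
{
	if (mp == NULL)
		return (-1);

	if (waitfor == SYNC_LAZY) {
		mp->lazy_skips++;
		return (0);
	}

	/* A real filesystem would flush its dirty state here. */
	mp->full_syncs++;
	return (0);
}
```

With this shape, the periodic syncer costs such a filesystem one function
call per pass, while explicit sync requests still go through the full path.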
However, we do then have our own threads running on each node to flush
the journal regularly (in addition to when it fills up), and I don't
know enough about this to know if it could be fit into the syncer
thread idea or if it's too tied in somehow to our architecture.
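
For readers unfamiliar with the pattern, the "flush regularly, and also when
full" policy can be sketched in a few lines of userland C. This is only an
assumption about the general shape of such a policy, not a description of
our implementation; struct example_journal, journal_should_flush, and
journal_flush are hypothetical names, and time is a plain counter in seconds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical journal state (all fields are assumptions). */
struct example_journal {
	size_t capacity;	/* bytes the journal can hold */
	size_t used;		/* bytes currently journalled */
	long flush_interval;	/* seconds between periodic flushes */
	long last_flush;	/* time of the last flush, in seconds */
};

/* Flush when the journal is full or the periodic interval has elapsed. */
bool
journal_should_flush(const struct example_journal *j, long now)
{
	if (j->used >= j->capacity)
		return (true);
	return (now - j->last_flush >= j->flush_interval);
}

/* Pretend to write the journal out, then reset the bookkeeping. */
void
journal_flush(struct example_journal *j, long now)
{
	j->used = 0;
	j->last_flush = now;
}
```

A per-node flusher thread would loop over journal_should_flush() and call
journal_flush() when it returns true; whether that loop could live inside a
per-mount syncer thread is exactly the open question.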
More information about the freebsd-hackers mailing list