Tmpfs elimination of double-copy

Kostik Belousov kostikbel at gmail.com
Mon Jun 21 18:49:34 UTC 2010


On Mon, Jun 21, 2010 at 10:30:55AM -0400, John Baldwin wrote:
> On Monday 21 June 2010 8:58:25 am Kostik Belousov wrote:
> > Hi,
> > Below is the patch that eliminates the second copy of the data kept by
> > tmpfs when a file is mapped. It also removes potential deadlocks due to
> > tmpfs doing copyin/out while a page is busy. It is possible that the patch
> > also fixes the known issue with sendfile(2) on a tmpfs file, but I have
> > not verified this.
> > 
> > The patch essentially consists of three parts:
> > - moving vm_object's vnp_size from the type-discriminated union to the
> >   vm_object proper;
> > - making the VM not choke when the VM object held in struct vnode's
> >   v_object is a default or swap object instead of a vnode object;
> > - using the swap object that keeps the data for a tmpfs VREG file as
> >   v_object as well.
> > 
> > Peter Holm helped me with the patch; apparently we survive fsx and stress2.
> 
> Why did you have to move vnp_size out of the union?  Is tmpfs using a non-
> OBJT_VNODE object to hold file data?
Tmpfs uses an OBJT_SWAP object to keep the data pages for its files.
The current code allocates another object of type OBJT_VNODE, assigned
to vp->v_object, to satisfy the VM interface for mapping the file, using
vnode_create_vobject(). The two objects do not share pages (I do not
think this can be easily achieved without serious changes to the VM).
Thus most, if not all, of the data is present in two sets of pages.

When such a file is written to, tmpfs copies the user buffer both to the
swap object and to the v_object.

The patch I posted assigns the swap object to vp->v_object. I had to
make a small change to vm_mmap_vnode() so that it does not allocate the
vnode pager and does not increment the vnode use counter when v_object
is the swap object.
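
As a rough illustration of the difference between the two schemes, here
is a small userspace model. It is not the patch itself and uses no kernel
interfaces: every model_* name is hypothetical, and a VM object is reduced
to a flat byte array instead of a set of pages. It only shows why a write
needs two copies when v_object is a separate object, and one copy when
v_object is the swap object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_FILE_SIZE 64	/* hypothetical fixed file size for the model */

/* Stand-in for a VM object: a flat byte array instead of pages. */
struct model_object {
	unsigned char data[MODEL_FILE_SIZE];
};

/* Stand-in for a tmpfs regular file. */
struct model_file {
	struct model_object *swap_obj;	/* where tmpfs keeps the file data */
	struct model_object *v_object;	/* what a mapping of the file sees */
};

/* Old scheme: v_object is a separate object, so the data exists twice. */
static void
model_init_old(struct model_file *f, struct model_object *swap,
    struct model_object *vobj)
{
	memset(swap, 0, sizeof(*swap));
	memset(vobj, 0, sizeof(*vobj));
	f->swap_obj = swap;
	f->v_object = vobj;
}

/* New scheme: the swap object itself is used as v_object. */
static void
model_init_new(struct model_file *f, struct model_object *swap)
{
	memset(swap, 0, sizeof(*swap));
	f->swap_obj = swap;
	f->v_object = swap;
}

/* Write a user buffer to the file; returns how many copies were made. */
static int
model_write(struct model_file *f, size_t off, const void *buf, size_t len)
{
	int copies = 0;

	assert(off <= MODEL_FILE_SIZE && len <= MODEL_FILE_SIZE - off);

	memcpy(f->swap_obj->data + off, buf, len);
	copies++;

	/* Only needed while the mapping is backed by a second object. */
	if (f->v_object != f->swap_obj) {
		memcpy(f->v_object->data + off, buf, len);
		copies++;
	}
	return (copies);
}
```

In this model, a write after model_init_old() performs two copies, while a
write after model_init_new() performs one and is immediately visible
through v_object, because both pointers refer to the same object.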

vnp_size has to be provided at the object layer because our swap
object is used, e.g., to mmap executables from tmpfs, and the image
activation code relies on vnp_size instead of the slower VOP_GETATTR().
I think this route is easier than converting all vnp_size users to
VOP_GETATTR() solely for the benefit of tmpfs.
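
To make the vnp_size point concrete, here is a simplified userspace
sketch of the two layouts. It is an assumption-laden model, not the real
struct vm_object: the model_* names, the swp_blocks member and the use of
long for the size are all hypothetical. It only shows that a size stored
in the vnode arm of a per-type union cannot be read from a swap object,
while a size stored in the object proper can be read for any type.

```c
#include <assert.h>
#include <stddef.h>

enum model_objtype { MODEL_OBJT_SWAP, MODEL_OBJT_VNODE };

/* Before: the size lives in the vnode-pager arm of a per-type union. */
struct model_obj_before {
	enum model_objtype type;
	union {
		struct { long vnp_size; } vnp;	/* valid for MODEL_OBJT_VNODE */
		struct { int swp_blocks; } swp;	/* valid for MODEL_OBJT_SWAP */
	} un_pager;
};

/* After: the size is a member of the object itself. */
struct model_obj_after {
	enum model_objtype type;
	long vnp_size;			/* valid for every object type */
	union {
		struct { int swp_blocks; } swp;
	} un_pager;
};

/* Returns 0 and stores the size, or -1 if this object type has none. */
static int
model_size_before(const struct model_obj_before *o, long *size)
{
	assert(o != NULL && size != NULL);
	if (o->type != MODEL_OBJT_VNODE)
		return (-1);
	*size = o->un_pager.vnp.vnp_size;
	return (0);
}

/* The size is available regardless of the object type. */
static long
model_size_after(const struct model_obj_after *o)
{
	assert(o != NULL);
	return (o->vnp_size);
}
```

With the first layout, a caller holding a swap object would have to fall
back to another way of obtaining the size; with the second, existing
vnp_size users keep working when v_object is a swap object.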