COW and mprotect on non-shared memory

Ed L Cashin ecashin at uga.edu
Wed Aug 6 09:31:07 PDT 2003


Hi.  I've noticed that in FreeBSD, the struct vm_map_entry has an
eflags member that can have the MAP_ENTRY_COW bit set.

In the vm_map_protect function, which is used by mprotect, it looks
like this bit is used to determine whether the page table entries
are set up to allow write access:

		if (current->protection != old_prot) {
#define MASK(entry)	(((entry)->eflags & MAP_ENTRY_COW) ? ~VM_PROT_WRITE : \
							VM_PROT_ALL)

			pmap_protect(map->pmap, current->start,
			    current->end,
			    current->protection & MASK(current));
#undef	MASK
		}

... so if this vm_map_entry describes a VM region that is set for COW,
then the page table entries will not allow writes.  If it's not COW,
though, the page table entries will be set to allow writes.  Is that
correct so far?
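
To make my reading of the MASK macro concrete, here is a small
user-space sketch.  The DEMO_* constants and effective_prot() are names
I made up for illustration; they are not the kernel's definitions, and
the sketch models only the masking step, not pmap_protect itself.

```c
#include <stdio.h>

/* Illustrative stand-ins, NOT the kernel's definitions. */
#define DEMO_PROT_READ   0x1u
#define DEMO_PROT_WRITE  0x2u
#define DEMO_PROT_EXEC   0x4u
#define DEMO_PROT_ALL    (DEMO_PROT_READ | DEMO_PROT_WRITE | DEMO_PROT_EXEC)
#define DEMO_ENTRY_COW   0x4u   /* plays the role of MAP_ENTRY_COW */

/*
 * Hypothetical helper: the protection that would be handed to the pmap
 * layer for a map entry with the given eflags and requested protection.
 * A COW entry has the write bit stripped; any other entry keeps it.
 */
unsigned int effective_prot(unsigned int eflags, unsigned int prot)
{
    unsigned int mask = (eflags & DEMO_ENTRY_COW) ? ~DEMO_PROT_WRITE
                                                  : DEMO_PROT_ALL;
    return prot & mask;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to get a standalone program. */
int main(void)
{
    unsigned int rw = DEMO_PROT_READ | DEMO_PROT_WRITE;

    printf("COW entry:     0x%x\n", effective_prot(DEMO_ENTRY_COW, rw));
    printf("non-COW entry: 0x%x\n", effective_prot(0, rw));
    return 0;
}
#endif
```

If that model is right, a read/write request on a COW entry comes out
read-only at the page table level, and the first write has to fault.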

The reason I'm interested in this is that I'm doing some VM work on
Linux, where they might have COW pages sprinkled throughout a VM
region.  The Linux VM region descriptor analogous to FreeBSD's
vm_map_entry is the vm_area_struct, and it doesn't have any special
bit for COW.  

In Linux, COW is recognizable only by the situation where for a given
page in a region of VM, the vm_area_struct has the VM_WRITE bit set
and the page table entry is write protected.

For that reason, when you mprotect an area of non-shared, anonymous
memory to no access and then back to writable, Linux has no way of
knowing that the pages weren't COW before you made them inaccessible.
It goes ahead and treats every page in the area as COW.

That means that if I do this:

    for (i = 0; i < n; ++i) {
      assert(!mprotect(p, pgsiz, PROT_NONE));
      assert(!mprotect(p, pgsiz, PROT_READ|PROT_WRITE|PROT_EXEC));
      p[i] = i & 0xff;
    }

... I get n minor page faults!  Pretty amazing, but I guess they
figured nobody does that.  

More surprising is that the same test program has the same behavior on
FreeBSD.  (At least, the "/usr/bin/time -l ..." output shows the
number of page reclaims increasing at the same rate that I increase
the value of n in the loop.)  

I thought that in FreeBSD any COW area would have its own vm_map_entry
with the MAP_ENTRY_COW bit set.  That way, you could run this test
without any minor faults at all.  Now I suspect I was incorrect.
Could anyone help clarify the situation for me?

Thanks.

-- 
--Ed L Cashin            |   PGP public key:
  ecashin at uga.edu        |   http://noserose.net/e/pgp/


